[Image: The NSA headquarters in Fort Meade, Maryland. Wikimedia Commons]

Machine learning algorithms used by the U.S. National Security Agency to identify potential terrorists in Pakistan may be ineffective, because there simply isn't enough data to reliably learn the signs of a terrorist, claims an investigation by Ars Technica UK.

The NSA project, disastrously named Skynet, uses cellular network traffic in Pakistan to identify and monitor potential threats, according to leaked documents published by The Intercept. Like many big-data machine learning systems, it takes millions of values as input and tries to match certain patterns. The Intercept revealed the program in 2015, but the Ars investigation digs into just how ineffective it could really be.

This is much like the machine learning used by tech companies today to govern most of what we see online. Facebook uses machine learning to rank your news feed, and Google has started to use it in search.

But these techniques only work reliably if the machine is initially trained on many, many examples of what the correct pattern looks like. In this case, that pattern could include locations and behaviors like excessively swapping cell phone hardware or only receiving calls, never placing them. Patrick Ball, director of research at the Human Rights Data Analysis Group, told Ars Technica that the data used is too vague for any reliable outcome.

“First, there are very few ‘known terrorists’ to use to train and test the model,” Ball said. “If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullsh*t.”
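Ball's point is easy to demonstrate. Here is a toy sketch, on entirely made-up data (not the NSA's actual pipeline), showing how a model scored on its own training records can look flawless even when the labels are pure noise:

```python
# Hypothetical illustration: scoring a model on the same records it was
# trained on can make random noise look like a perfect classifier.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# 1,000 fake "users" with 80 features each (mirroring the 80 variables the
# slides mention), labeled at random, so there is genuinely nothing to learn.
X = rng.normal(size=(1000, 80))
y = rng.integers(0, 2, size=1000)

model = KNeighborsClassifier(n_neighbors=1)  # memorizes its training data
model.fit(X[:500], y[:500])

print("accuracy on training records:", model.score(X[:500], y[:500]))  # 1.0
print("accuracy on held-out records:", model.score(X[500:], y[500:]))  # ~0.5, chance
```

The model hasn't learned anything; it has simply memorized its inputs. That is exactly the inflated "assessment of the fit" Ball is describing.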


Ball says that to test its model, the Skynet project uses data from just seven known terrorists, plus a random sample of 100,000 mobile phone users. The NSA shows the algorithm six of the seven known terrorist patterns along with all of the normal patterns, then tasks it with finding the seventh terrorist pattern hidden somewhere in the noise. These calculations are made on 80 variables for each cell phone user, and the NSA has records on 55 million users, according to the agency's presentation. Compare that to Pakistan's more than 180 million citizens, and the data is incomplete at best.
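The slides don't disclose the exact classifier, so as a rough sketch of that hold-one-out procedure, here is what the loop might look like on fabricated data, with a generic random forest standing in for the NSA's model and a scaled-down background sample:

```python
# Rough sketch of the hold-one-out test described above, on made-up data:
# train on six of the seven known terrorists plus the background sample,
# then check whether the model flags the held-out seventh.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_FEATURES = 80  # the 80 variables per user cited in the slides

terrorists = rng.normal(loc=1.0, size=(7, N_FEATURES))       # seven known positives
background = rng.normal(loc=0.0, size=(10_000, N_FEATURES))  # stand-in for the
                                                             # 100,000-user sample
found = 0
for i in range(7):
    train_pos = np.delete(terrorists, i, axis=0)  # six of the seven
    X = np.vstack([train_pos, background])
    y = np.array([1] * 6 + [0] * len(background))
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    found += int(clf.predict(terrorists[i:i + 1])[0])  # was the seventh flagged?

print(f"held-out positives recovered: {found}/7")
```

Even a perfect 7/7 here would mean little: with only seven positive examples, the uncertainty around any estimate of the model's hit rate is enormous.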

“Incomplete at best” is also a great way to describe the outputs. The NSA can get a 0.18 percent rate of false alarms, but only if it misses half of all potential matches. One slide literally says, “statistical algorithms are able to find the couriers at very low false alarm rates, if we’re allowed to miss half of them.” With 55 million records searched, that rate means about 99,000 hits would be false positives.
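The arithmetic is simple enough to check:

```python
# The math behind the slide's numbers: even a 0.18 percent false-alarm
# rate becomes a large absolute count at this scale.
records = 55_000_000       # cell phone users in the NSA's dataset
false_alarm_rate = 0.0018  # the 0.18 percent rate cited on the slides

print(f"{records * false_alarm_rate:,.0f} users falsely flagged")  # 99,000
```

And that count only covers the false alarms; by the slide's own admission, the same threshold misses half of the real targets.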

But all of this information rests on slides that might be from 2011 or 2012. We have no idea whether these methods have since been refined, thrown out, or are still being used today as they apparently were in 2011, with little oversight. The slides might be false. (That’s probably not the case, but it’s possible.) The NSA could have far more than 55 million records by now.

And it should also be noted that we have no idea what the NSA actually does with this data. It could be funneled into reports that inform drone strikes, although it would seem the government isn’t treating every positive match as a threat, despite the alarming 3,994 people killed by U.S. drone strikes in Pakistan since 2004.

Giving algorithms this much power isn’t a big deal when they’re tagging Facebook photos or deciding who sees an advertisement, but such a wide margin of error is deadly when lives are on the line.

“It’s bad science, that’s for damn sure,” Ball said.