The Man Who Lit The Dark Web

Data-mining tools are helping cops bust open online human trafficking

Before Chris White could help disrupt Jihadi finance networks, crush weapons markets, and bust up sex-slave rings with search tools that mine the dark Web, he first had to figure out how to stop himself from plummeting through the open gun door of a banking Black Hawk helicopter.

No hand-holding in a war zone, he thought.

It was September 2010. White was on his way to a forward operating base outside Kabul headquarters, as part of a secret intelligence cell to help confront the Taliban and al-Qaida, smash their encrypted online money stream, and win over the hearts and minds of the Afghanistan population.

Slight and lanky and 28, White felt Dukakis-ridiculous in his unwieldy body armor and bulbous helmet with “Dr. White” scrawled in marker on duct tape across the front, and with the dust from liftoff, he was finding it hard to breathe. He was still struggling with the unfamiliar seat straps when the pilot hit the stick, sending White sliding toward the hot square of the door and the desert 200 feet below.

Down there, Afghanistan was a messy, dangerous place for pretty much everybody. After nearly a decade of U.S.-led war, the American body count had hit 1,000, and civilian casualties were beyond calculation, as President Obama’s 30,000-troop surge intensified the fighting that spring. Many feared the situation was only going from bad to worse. The U.S. was escalating drone strikes across the border in Pakistan. And U.S. command was under assault after Gen. Stanley McChrystal, the surge’s architect, found himself without a job after he and his staff made disparaging remarks about the commander in chief in some music magazine.

It is hard to imagine that only a few weeks earlier, White had been just another impossibly young-looking Harvard postdoc in flip-flops looking forward to a Cambridge summer. Helicopter gunships and war zones weren’t on the radar; there were lattes in the square and rock climbing, and on the other side of campus, a prestigious fellowship in the School of Engineering and Applied Sciences, where he was working at the intersection of big data, statistics, and machine learning. He had earned academic pole position and had every expectation it would continue that way forever — becoming a professor, building a lab, and sniping out white papers from a tenured ivory tower.

But then his mentor asked him to attend a weekend conference at DARPA. White knew it as the alphabet soup that spelled out Defense Advanced Research Projects Agency, the Pentagon’s scientific-innovation department, the folks who brought you bionic exoskeletons, night vision, the M16, Agent Orange, GPS, stealth technology, weather satellites, and the Internet. DARPA projects combined smart people, big ideas, and big government dollars. Their goal was to help the nation prevent technological surprise and, every five to 10 years, to wheel out world-changing tech with a strategic edge.

White had gone grudgingly, expecting a PowerPoint presentation, a recruiting speech, “and maybe some theoretical question like you’d expect from DARPA — you know, see if we can build some giant laser,” White says. Instead, he got a top-level briefing on the world at war. He learned there were dark forces out there. Their acts were brutal, but their tactics and bureaucracy were sophisticated. They were killing and terrorizing, growing and winning. He also heard there was an opportunity to use big data to counter those forces; his country was eager to seize that advantage as quickly as possible.

By the end of a full day, White the wunderkind postdoc felt humblingly naive.

I don’t know anything about war, he thought. White had never been privy to the details from a practical, operational perspective. Increasingly, that perspective involved a need to make sense of gargantuan icebergs of raw and seemingly unconnected data, to pull plans and policies out of frozen mountains of intel.

America, it turned out, could use a guy like White in a war zone.

But first, he had to stop himself from plummeting through that chopper door. White scrabbled back to his seat, grabbed the straps, and held on as gunners slouched in the open door, watching for ground fire. These veteran warriors were like characters out of Mission: Impossible, White thought.

White was on their team but with a different role, as part of a nerd A-team in a classified DARPA program called Nexus 7. For nearly a decade, the U.S. military had been collecting intel in Afghanistan, reportedly courtesy of the CIA, the National Security Agency, GPS satellites, cellphone records, battlefield reports, digital financial streams, surveillance cameras, foreign intercepts, and fire-hose streams from every online social network out there. While this intel had been useful — for, say, a targeted drone strike — it mostly amounted to a data dump. And there was even more that the U.S. wasn’t utilizing in its quest to understand what Afghanistan’s citizens wanted and needed. These overlooked clues were, as Maj. Gen. Michael Flynn, then-head of U.S. intelligence in Afghanistan, put it, a “vast and underappreciated body of information.”

To fix that, DARPA had sent in White and a dozen other geeks to embed with fighting units and make better use of this data trove. Some of the geeks would fuse things like satellite data and on-the-ground surveillance to visualize how traffic flowed (or didn’t flow, indicating a nearby Taliban checkpoint or a roadside bomb). White’s team mission was to target the digital trail of the Taliban and al-Qaida’s financing.

Their data-mining tools were specific to the needs of the war, and successful enough to garner him promotions, medals, and citations. Eventually, White would take these tools and the lessons he learned back home, where they would help revolutionize criminal investigative work, lend a hand to the journalists probing massive downloads like the Panama Papers, and shine light into the dark data realm where drugs, guns, and human beings are bought and sold, and where illicit bitcoin billions flow freely. One day soon, they might even help pave the way for a more informed democracy.

Sliding toward that Black Hawk’s open door, White assumed it was the end. It was only the beginning.

Chris White in Afghanistan in 2010, part of a data-mining nerd team. (Photo courtesy of Chris White)

White is not a stuttering, Beautiful Mind type of genius. He’s more of a stealth nerd. I first met up with him this past November in the lobby of a hotel in downtown Seattle. The lithe and darkly handsome Oklahoman I found in a bright blue Patagonia windbreaker by the front desk came across as something like a smaller, quieter hipster Carl Sagan. Which is to say he’s not just bright and passionate, but he’s also nice and strangely normal — qualities that might seem at odds with his role as anointed visionary whiz kid. But apparent contradiction is White’s secret sauce: He’s an accomplished Ashtanga yoga practitioner who has been to war, a former government employee on a first-name basis with celebrity Buddhists and legendary hackers, and a practiced martial artist who’s dedicated to the solitary sit-down science of staring at computer screens.

These apparent contradictions have allowed White, now 34, to bridge worlds between experts. He’s not the genius cranking out code, the analyst looking for the next big IPO, the hand-shaking CEO, or the wartime general turning a pile of intel into a plan. He’s the guy who can talk to all of those people, understand them, and combine their strengths into a matrix none individually would have imagined.

Currently, that matrix has to do with making the Internet a more interesting, useful, and democratic tool for exploring our data universe. And it turns out, that’s not a career he could plan for. Post high school, White had surprised classmates by veering into the hard sciences. He then surprised his family and himself by abandoning a pre-med track for electrical engineering. He continued to surprise them with his facility for statistics and computer science, leading to a rarefied academic byway where machine learning and big data intersected with human language.

“Some of the best minds of our generation are using the Internet to make advertisers richer,” White says. “But the connectivity of the Internet is also an unprecedented mechanism for compassion, for understanding each other, ourselves, and our world. What could be more interesting than that?”

But by the time White traveled from his Harvard postdoc to that DARPA briefing, he had already parlayed an electrical engineering degree from Oklahoma State University into a fellowship from the Department of Homeland Security, and earned his Ph.D. at the Center for Language and Speech Processing at Johns Hopkins University. He’d also worked with Microsoft, MIT, IBM, and Google. And, he says, none of that had prepared him for what he calls the “no-kiddingness” of the mission in Afghanistan.

“I was blown away,” says White. “It was scary, and it was stressful, and I was really intensely focused on the work. I knew I was contributing to something important. But I had no idea that I was making a radical life change.”

At the time, DARPA was changing too. Its new director, Regina Dugan, had shepherded Nexus 7 through the Pentagon bureaucracy. She believed in the power of crowdsourcing complicated problems and wanted DARPA to take on a more active wartime role, rather than blue-skying technologies that might remake the military 10 years down the road. As she had told a Congressional panel, she wanted military leaders to know DARPA was in the fight.

Nexus 7 would be the tip of the spear. The effort was designed by DARPA project manager Randy Garrett, overseen by Dugan, and greenlit by Gen. David Petraeus. The teams were split into two groups totaling about 100 computer scientists, social scientists, and intelligence experts. The larger group remained stateside, writing code and mashing up military data sets; White was in the smaller group, looking over shoulders in military HQ tents in Afghanistan.

The Taliban and al-Qaida were military organizations committing atrocities in the name of Allah, but increasingly they operated like criminal organizations that ran not on religion, but money. That money paid for every bullet and bomb, kept troops together and villages friendly, and bought information and protection, vehicles and fuel, hearts and sometimes minds.

Like any criminal operation, most of that money came from criminal activity: physical theft, or the sale of wares such as weapons, drugs, and, increasingly, human beings for ransom, slavery, or sex.

Those transactions, and the profits from them, were hidden and laundered through legitimate businesses and shell corporations. Some of this happened in the physical world — real drugs, real people, real wads of cash money. But increasingly, that criminal activity — everything from the buying and selling of wares via the dark Web and social media, to the filtering of proceeds through bitcoin transactions and encrypted accounts — could be carried out more easily online, in the same digital world White had spent his career studying.

The coalition generals in Afghanistan had known this for years, but that didn’t mean they knew all the details. Nexus 7’s larger role was to find useful needles in the haystack of U.S. intelligence — including anything that could help the generals better understand the needs of the Afghan people. White’s team focused on the source of the money, the guns, the drugs, and the human sex-traffic, figuring out where and why these transactions took place and who was involved. White played middleman between the DARPA teams coding stateside and the needs of the military commanders in Afghanistan.

“Unfortunately, that meant a lot of cold calling, a lot of asking for meetings from these big commanders. It was really stressful,” White says. “I’m not really sociable. But I knew I had to just swallow that because that was the job.”

Getting into conversations with people in a war zone who didn’t know or care why White was interrupting their job was a learning curve steeper than a Black Hawk’s takeoff, and a waking anxiety nightmare. White didn’t talk crap or sports — or, frankly, particularly like people at first. Worst of all, he was a civilian. He had no military uniform, military training, or military rank — the shorthand on the collar or sleeve for who needs to make time for whom.

“One thing about war,” White says, “is people are really busy.”

He didn’t even have a particularly military bearing. While other guys pumped iron, the lithe little yoga dude they called Dr. Spaghetti Man was stretching and breathing on the wrestling mats, an Ivy Leaguer downward-dogging in a world of booyah. Gradually, as he extended his stay from nine days to 90, and then signed on for more stints in the country over the next year and change, he became DARPA’s senior in-country lead in charge of Nexus 7, and a citizen of this military world. He learned to invoke the “Dr.” early and often, learned that the embarrassingly fancy watch his dad had given him worked like stars and bars in the government dress code. And he learned that using martial-arts skills to put big guys on their asses during rec time made a positive impression, and turned fighting men into friends. It also helped White and his team do their jobs. The specific metrics are classified, but the presidential reports and citations are clear: Nexus 7 made a meaningful contribution to the war for hearts, minds, and lives.

By the end of his time in Afghanistan, Nexus 7 had earned the respect of the commanders too, and Dr. Spaghetti Man held a DARPA rank equivalent to that of a one-star general. Nexus 7’s efforts also gained citations and medals from the Department of Defense and the Department of the Treasury. Among other things, White’s team was commended for creating the “large data analytic framework” that provided “unique and valuable insights against key strategic and operational questions.” White personally was cited as a credit to the agency.

But all the lacquer and ribbon came at a cost. Chris White was no longer the same wide-eyed postgrad who had boarded a jet to Kabul. “By the end, I’d dropped out of Harvard and lost my long-term girlfriend,” White says. But most changed was his view of the world.

White wouldn’t say he was shell-shocked. He hadn’t been battering doors and stepping on strange earth loaded with explosives. But for the first time, he’d seen what the enemy — what people — were capable of.

The job was over; it was time to move on from the war. But White felt he wasn’t ready to leave every battle behind. He would soon get the chance to take one battle beyond the boundaries of war.

The data White had helped track had led the people who risked their lives toward places where women and children were traded as commodities, and White had seen firsthand how vulnerable those women and children were. He also learned that those crimes didn’t exist in Afghanistan alone. And it didn’t take a plane to find them; it took a modem.

The Internet you know is not the Internet. Or not all of it. To start, there’s the Internet of Bing, Google, Firefox, and Siri—the places where your Gmail and bookmarks live, where you find cat litter and football scores. That’s said to represent over 200 terabytes of data, more than if you digitized all the printed material in the Library of Congress. That’s a lot of reading, but it’s not the Internet; it’s just the surface.

Estimates vary, but the “surface” Web, or open Web, represents between 5 and 20 percent of what’s out there. The rest resides in places that most crawlers can’t reach or index. Some data are “deep,” in password-protected places like social media and message boards, or in increasingly common dynamic websites—which are more like apps than pages from a book, and change when you interact with them, like Kayak. The rest of the Web is “dark.”

But the dark Web isn’t a road you’ve neglected to drive down on your way to amazon.com. The main tool of access is Tor (originally an acronym for The Onion Router). Onion routing was first developed by the U.S. Naval Research Lab to ensure secure intelligence communication. It bounces encrypted information through a series of anonymized nodes, rendering it virtually untraceable, letting you browse a Web you wouldn’t want cookies and targeted ads to track—and creating a haven for those who fear surveillance and authoritarian control.

The dark Web does not discriminate among government users, savvy cyber libertarians, planning boards for ISIS, whistle-blowing hacktivists, or Arab Spring planners. Its free markets are unregulated, and specialize in goods that need to be bought and sold anonymously. In the dark, you’re always only three clicks from the illegal, repulsive, or violent, or, more often than not, from sharing a jail cell with Jared from Subway. You can probably find China White heroin, fake E.U. and U.S. passports, nonsequential supernote Benjamins, Peruvian flake, DMT, Hard Candy, Pink Meth, and dump sites for hacked nude celebrity selfies.

If you’re one of the estimated 2.5 million daily visitors to this dark world, you’re laughing unkindly (or trashing this description online). No dark-Web catalog can ever be complete or correct. This game of Whac-A-Mole is liberating for some, frustrating for others. It’s also a perfect landscape for criminal organizations and terror groups to communicate, advertise, or buy or sell anything, including human beings. As you read this, an estimated 21 million people are being trafficked around the planet. More than half are women and girls. More than 1 million are children. Nearly one-quarter are bought and sold as sex slaves. Only 1 in 100 victims of human trafficking is ever rescued. It’s a booming business. High profits and low risk make human trafficking one of the fastest-growing and most lucrative crimes on the planet; the U.N. recently estimated that trafficking nets $150 billion a year.

And as a business, it differs negligibly from the sale of kitty litter or crew-neck sweaters; in order for consumers to buy your product, they have to be able to find it. While the makers of Tidy Cats can take out a billboard, human traffickers need to be visible enough that their customers can find them, but hidden enough that they can’t be tracked down by authorities. Not surprisingly, that puts the majority of sex-traffic data in the deep or dark Web, or hidden in plain sight in the terabytes of the surface Web, in ways quite different from legal businesses that want to be found by consumer Web-search engines.

The exact formula for how search engines like Bing and Google rank results is governed by secret algorithms mere mortals aren’t allowed to know. But two factors dominate: Pages linked by other pages are ranked higher, as are pages with keywords matching the search terms. That’s what puts Wikipedia pages at the top of most Google searches—they cite, and are cited by, numerous other lesser sources (such as blogs). But sex traffickers don’t want to be found via Web search. To throw off the index, they advertise through one-off ads, unlinked to others. They hide deep in chat rooms or uncrawlable social-media posts. They avoid search-engine optimization. Instead of keywords, they use photos and code words. At this moment, there are likely hundreds of thousands of active ads for sex for sale on the Internet. Detectives using regular search engines have an extremely difficult time finding these or making cases against criminals who don’t play by Google’s rules.
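The two ranking factors described above can be sketched in a few lines of Python. This is a toy model, not how Bing or Google actually score pages: the function `rank_pages`, the sample pages, and the simple "inbound links plus keyword hits" score are all assumptions made for illustration.

```python
from collections import Counter

def rank_pages(pages, links, query):
    """Toy ranking: score = inbound link count + keyword matches."""
    # Count how many links point at each page.
    inbound = Counter(target for _, target in links)
    terms = query.lower().split()
    scores = {}
    for name, text in pages.items():
        words = text.lower().split()
        keyword_hits = sum(words.count(term) for term in terms)
        scores[name] = inbound[name] + keyword_hits
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda name: (-scores[name], name))

# Hypothetical pages: a well-linked reference page, a blog, and a one-off ad
# with no inbound links and no matching keywords.
pages = {
    "encyclopedia": "helena is the capital of montana",
    "blog": "my trip to montana",
    "one_off_ad": "new in town call now",
}
links = [("blog", "encyclopedia"), ("forum", "encyclopedia")]

ranking = rank_pages(pages, links, "capital of montana")
print(ranking)
```

In this sketch the unlinked, keyword-free ad lands at the bottom of the list, which is the effect traffickers are counting on.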

Chris White was given the chance to change the rules.

Once upon a time, White had traded a safe academic track for an intellectual military adventure. Two years later, both were over, and at 30 years old, he had to make a new path. From the outside, his life might have seemed a logical progression. But to White, it was as if he’d fallen down a rabbit hole and come out the other side. And then DARPA offered him the position of program manager. He’d found his Wonderland.

“Once, I’d wanted to ‘be a thing,’” White says, meaning a respected position like a doctor or a principal investigator. “But now I realized I wanted to ‘do a thing.’”

As a DARPA program manager, White could name his project. And the “thing” he wanted to make was a new breed of search engines, capable of mining the entirety of the Internet.

In Afghanistan, there were few off-the-shelf tools for mining big data or visualizing the results; they were built mostly for experts and for specific projects. But what if they could build off-the-shelf pieces and make them available to everyone? A sort of Erector Set of super-search-engine pieces that you could assemble any number of ways.

The result was initially a three-year—and reportedly up to $50 million—project to construct that search-engine Erector Set: a suite of perhaps 20 new super-search-engine parts, coded by 17 different units from private industry and universities, and dedicated to providing better ways of interacting with and understanding the data available on the whole Internet, in ways farther reaching and more transparent than anything possible with Firefox, Safari, Google, or Bing.

They called it Memex (a name combining “memory” and “index”), borrowed from a 1945 article by Vannevar Bush, the visionary former director of the Office of Scientific Research and Development. Memex would be a tool to visualize connections between ideas and facts. If it worked, it could empower human researchers with superhuman insights.

As White explains, data on the Internet is essentially descriptions of what happened in the real world—photos, emails, blogs, phone calls, GPS trails, and social-media posts. “The goal of an investigator is to dig through the descriptions and work backward,” White says, “to understand that real-world event.”

With a traditional Web browser, that’s no easy task.

Type a search term—such as a phone number—into Google, and you might get 20,000 results, links to pages from the surface Web, ranked in order of keyword hits and the number of hyperlinks each page has to and from other pages. Your only option is to click through those results one by one, checking each page for the single answer you are hoping to find. That’s fine for discovering facts such as “What is the capital of Montana?” But for complicated investigations, White likens it to using a push mower to mow a golf course. “It’s sequential and prone to error,” he says. “There are better ways.”

White’s Memex project would be a portfolio approach. Some tools would dive into the dark Web and present all the hidden onion sites to be found there as a list, something previously considered too difficult to bother with. Others would index and sort the enormous flows of deep and dark Web online forums (which are otherwise unsearchable). Others would monitor social-media trends, connect photos, read handwritten information, or strip out data from Web pages and cross-index the results into data maps.

In theory, White’s search-engine Erector Set could be useful for any number of real-world applications; as a DARPA project, they needed to prove it could be effective for at least one. Ideally, that test application would attack a real-world data-rich problem that could help investigators make the world a better place, and the country safer.

White decided to focus the Memex test application on helping American law enforcement target a crime he’d been shocked to learn about in Afghanistan and found “inherently horrible”: the buying and selling of human beings.

On a computer screen in the Memex lab in Arlington, Virginia, Wade Shen, its current program manager, demonstrates how some of the Memex tools have been tweaked for sex-trafficking investigations. The first is Datawake. Normally, a detective following a lead (for example, an email associated with a prostitute) plugs that info into Google, gets no exact matches but perhaps 25,300 results, and might open only a few of those before spotting a potential new clue and plugging that into the search bar instead, and moving on. Searching the entire 25,300 hits this way would take a detective two weeks of 12-hour shifts.

Datawake combs those same Google results, pulls the information off the pages, and organizes it visually. On-screen, the results appear as a series of circles. Lines between the circles indicate connections between data—names, phone numbers, and photos that might appear repeatedly alongside that email. The detective gets a peek into all 25,300 results—and can start chasing down the most promising leads without leaving behind any of the other results.
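The circles-and-lines picture can be approximated with a small co-occurrence count. This is a rough sketch of the general idea only, not Datawake's actual code: the regular expressions, the function names, and the sample page texts are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified patterns, assumed for illustration; real extraction is far messier.
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def extract_entities(text):
    """Pull phone numbers and email addresses out of one page's text."""
    return set(PHONE.findall(text)) | set(EMAIL.findall(text))

def cooccurrence_edges(pages):
    """Count how often each pair of entities appears on the same page."""
    edges = Counter()
    for text in pages:
        for a, b in combinations(sorted(extract_entities(text)), 2):
            edges[(a, b)] += 1
    return edges

# Made-up search results standing in for scraped pages.
pages = [
    "Contact lead@example.com or 555-010-1234",
    "New number 555-010-1234, ask for Sam, also 555-010-9999",
    "Write lead@example.com, call 555-010-1234",
]

edges = cooccurrence_edges(pages)
for (a, b), weight in edges.most_common():
    print(a, "<->", b, "seen together on", weight, "page(s)")
```

Each entity would be a circle and each counted pair a line; the heavier the count, the more promising the lead.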

Tools such as these have allowed district attorneys’ offices to go back to the case files of their successfully prosecuted sex cases, and reuse the phone numbers, names, emails, and physical addresses already established as evidence. The Memex tools allow these old cases to provide search terms to build new cases and prove criminal conspiracy, linking guys in prison to sex rings still operating.

One of the most useful tools is TellFinder, which pulls and organizes co-referenced information from sex ads. By finding commonalities in ads—the author’s “tells”—it can group together ads from the same author or organization, giving investigators a greater insight into the scope of the business. In one demo, Shen pulls up 869,000 current ads represented like population-density bubbles across the states. He zooms in to towns and jurisdictions and scrolls backward through dates, revealing where ads were posted and faded away over time. The map also shows phone numbers, emails, and physical addresses the ads have in common, and even photos with the same background (the same motel drapes and wallpaper in the background can lead detectives to a sex-trafficking site). With a few clicks, Shen shows how ads for one woman moved across the country, demonstrating the probable track of her being trafficked.
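Grouping ads by shared "tells" is, at its simplest, a connected-components problem: two ads belong together if they share a phone number, an email, or an image fingerprint, even through a chain of other ads. The sketch below uses a standard union-find structure to show that idea. It is an assumption about how such grouping could work, not TellFinder's published method, and the ad IDs and tells are invented.

```python
def group_ads(ads):
    """Group ad ids that share any tell, directly or through a chain of ads."""
    parent = {ad_id: ad_id for ad_id in ads}

    def find(x):
        # Follow parent pointers to the group's root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # tell -> first ad in which it appeared
    for ad_id, tells in ads.items():
        for tell in tells:
            if tell in first_seen:
                # Same tell seen before: merge the two ads' groups.
                parent[find(ad_id)] = find(first_seen[tell])
            else:
                first_seen[tell] = ad_id

    groups = {}
    for ad_id in ads:
        groups.setdefault(find(ad_id), set()).add(ad_id)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical ads with the contact details and image hashes found in each.
ads = {
    "ad1": {"555-010-1234", "a@example.com"},
    "ad2": {"555-010-1234"},
    "ad3": {"b@example.com", "motel-drapes-hash"},
    "ad4": {"a@example.com", "555-010-7777"},
    "ad5": {"motel-drapes-hash"},
}

groups = group_ads(ads)
print(groups)
```

Here ad4 never shares a phone number with ad2, yet both end up in the same group because each overlaps with ad1.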

Another tool, called Dig, takes that co-referenced information and sorts it into a list that looks a bit like the results of an Amazon search. Along the side, key categories and terms allow investigators to filter the results down to just the information they’re looking for. Dig also takes TellFinder’s image-search capabilities and kicks them up a notch. “It’s just another way of looking at the same problem,” Shen explains. “And these are just examples—there’s no one way to use these tools.”

Some Memex tools have been specialized to perform similar tasks in the dark Web, crawling the otherwise unsearchable sites for specific information types.

White showed me another tool back in Seattle: Aperture Tiles. It makes formerly unmanageable amounts of information (think billions of moving data points on a map) manageable. To demonstrate, he combined motel addresses associated with sex trafficking with the location information attached to online posts made near those addresses. (“Most people have no idea that when they’re accepting the permissions on a free app, it’s them and their data that’s the commodity,” he notes.)
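One common way to tame that many points is to bin them into map tiles and draw the counts instead of the raw dots. The snippet below is a minimal sketch of that binning step under stated assumptions: `tile_counts`, the half-degree tile size, and the coordinates are all illustrative, and nothing here is taken from the Aperture Tiles codebase.

```python
import math
from collections import Counter

def tile_counts(points, tile_degrees):
    """Bin (lat, lon) points into square tiles and count points per tile."""
    counts = Counter()
    for lat, lon in points:
        tile = (math.floor(lat / tile_degrees), math.floor(lon / tile_degrees))
        counts[tile] += 1
    return counts

# Made-up post locations: three near Seattle, one in Manhattan.
points = [
    (47.61, -122.33),
    (47.62, -122.34),
    (47.68, -122.12),
    (40.75, -73.99),
]

counts = tile_counts(points, tile_degrees=0.5)
for tile, n in counts.most_common():
    print(tile, n)
```

A billion points reduce to however many tiles fit on the screen, and shrinking `tile_degrees` as the user zooms in recovers the detail.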

Often, patterns emerged: The people posting the ads would drive from city to city around the U.S., deciding every few days to get out of Dodge, likely as a way to stay under law enforcement’s radar. Some people who posted frequently in the U.S. also posted frequently in Southeast Asia. What that means, White says, is a question only a full investigation can answer, but it’s reasonable to assume it indicates a connection with international sex trafficking.

On December 19, 2014, Froilan Rosado sat in an idling van outside a midtown Manhattan sex hotel, a pregnant 16-year-old in the passenger seat. In his late 30s, Rosado was the kind of guy who liked to post Facebook photos of him and his family dressed like convicts for Halloween, and selfies in mirrored shades with his hair braided into cornrows, a pencil goatee framing a scowl. Rosado was a pimp. Inside the hotel was his 18-year-old prostitute, “Flora.” Undercover cops had picked her up in a run-of-the-mill prostitution-sting operation.

But really, she was the victim.

Flora told investigators she’d been kicked out of her foster home and had nowhere to go. Rosado took her in, then started pimping her out. Investigators soon learned that Rosado had become an expert at luring girls and women into the street trade over social media, getting young women already under his sway to contact girls as young as 15 on Facebook. Once they were lured in, he kept them in line with violence, drugs, and promises of money. In one instance, he choked a girl who refused to obey. In a text, he referred to a girl as “fresh meat.” He put their photos in Backpage sex ads with a contact number. He’d take the call, book the dates, and wait outside to get his cut.

To build a stronger case against Rosado, the Office of Manhattan District Attorney Cyrus R. Vance Jr. wanted to track more girls. Flora didn’t know their full names, phone numbers, or whereabouts. And she didn’t really know the details of how Rosado covered his digital tracks. She didn’t know, for example, that he routinely deleted or changed his girls’ online ads, or changed their names, or switched out burner phones. And so the investigators had nothing that could connect Rosado to a larger prostitution ring, even while he ran his business over the phone from New York City’s Rikers Island jail.

They turned to Memex, which started collaborating with their offices in 2014. Analysts used early versions of Dig and TellFinder to mine Rosado’s invisible traces across deleted and current sex ads, and instantly linked photos, names, emails, phone numbers, and more girls. As Rosado continued his business from jail, investigators listened in as he mentioned new phone numbers, which they could then plug in to Memex and connect to the others. Soon they identified and located even more victims, building the evidence that linked Rosado to a prostitution ring, including 10 teenagers ranging from 15 to 18 years old, and a case that would stick. On September 15, 2015, nearly a year after Rosado was arrested, he was sentenced to seven to 14 years in prison on charges of sex trafficking and promoting prostitution. Today, the Manhattan District Attorney’s Office employs Memex in all of its human-trafficking investigations, having screened 4,752 potential cases in the first six months of 2016 alone.

On a drizzling Tuesday this past November, I met Chris White at his new office in Microsoft’s town-size campus in Redmond, Washington, about a dozen miles northeast of Seattle. White’s directions led along Route 520 to a parking garage and a modern glass-fronted building marked with the number 99, and inhabited almost exclusively by Ph.D.’s.

It was after-hours when White brought me past security and into the maze of offices filled with prototypes and experiments, and glass walls covered in equations. White had left DARPA in May 2015, just before his appointment there ended (the organization employs its researchers for only a limited amount of time in order to keep new ideas flowing and the talent pool fresh). But again, White felt that he’d been popped out of a rabbit hole and faced a crossroads.

At first he considered starting a company that would use automation and artificial intelligence to allow companies to do their own data analysis and online-security work. The idea was good enough to get interest from venture-capital groups. But then White thought about life as a startup CEO, the toll on his life with his fiancée (White married this past March), and the limited impact it would have on the world.

And so, instead of burning a decade being a CEO, White opted to make a thing—and an existence—that he considered simpler, yet bigger.

As a principal researcher in Microsoft’s Special Projects division, he gets to build on his work with Memex—making affordable, user-friendly, data-exploring and visualization tools for businesses (and journalists, and everyone else).

“The bar is even higher,” he says. “The question is no longer ‘Can we make something that works?’ It’s ‘Can we make something that works for a billion people?’”

White hopes his new project will, among other things, change people’s relationship with big data, and each other. It could also impact our democracy in ways no one has ever imagined.

Before I left, White flipped open his Lenovo ThinkPad X1 and opened a tool called Newman, a data-visualization tool that shows patterns in an email history—in this case, Jeb Bush’s email from eight years as Florida’s governor. In seconds, Newman sorted 250,000 emails into a nodal flower, showing who Bush had emailed and how often, who was CC’ed, where those were forwarded, and how quickly those emails were responded to. It was, in effect, an interactive map of influence and decision-making, the guts of democracy made transparent. White easily could have run the program over time to show relationships with lobbyists or donors, turning the candidate’s record round and round, like an apple in the hand.
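The "nodal flower" boils down to counting who writes to whom. As a hedged illustration of that first step (not Newman's actual implementation), the sketch below tallies sender-to-recipient pairs from a list of messages. The message format, field names, and addresses are invented for the example.

```python
from collections import Counter

def sender_recipient_counts(messages):
    """Count messages per (sender, recipient) pair, treating CC as a recipient."""
    counts = Counter()
    for msg in messages:
        for recipient in msg["to"] + msg.get("cc", []):
            counts[(msg["from"], recipient)] += 1
    return counts

# Hypothetical mailbox; the addresses are placeholders.
messages = [
    {"from": "gov@example.org", "to": ["aide@example.org"]},
    {"from": "gov@example.org", "to": ["aide@example.org"], "cc": ["press@example.org"]},
    {"from": "aide@example.org", "to": ["gov@example.org"]},
]

counts = sender_recipient_counts(messages)
for (sender, recipient), n in counts.most_common():
    print(sender, "->", recipient, n)
```

Drawn as a graph, each address becomes a node and each count the thickness of the line between two nodes; timestamps could be added the same way to measure response times.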

“In a knowledge economy, this is power,” White says. “Right now there are only a few browsers, and they’re the only interface to the world’s information. With Memex, we thought we could really do something about that.”

Memex tools can show the movements of ISIS recruits or propaganda; links between shell companies and money laundering; the flow of illegal guns or labor; and heat maps showing the frequency of social-media mentions for words and ideas, and the intention around them, live across the map. They’ve been sought out to track an Ebola outbreak in West Africa, to understand how people moved in and out of hot zones, and to help the White House determine how to respond to the outbreak. They can also track and map moods and public sentiments as they ripple and change across the planet.

It’s not difficult to imagine how such transparency might inform our understanding of global opinions far beyond our limited views of Twitter or our personal Facebook feed. Even easier to imagine is the threat such transparency poses to the current Internet power and profit model—the advertisers who depend on paid experts to rank or review their product, or use SEO tricks or money to steer Internet searches toward their goods, and search companies that make their money selling access to that influence. Or dictatorships using those same techniques to influence and control citizens. Or even a democracy, where a handful of tech companies control the information flow—making it hard for even the most benevolent corporations to avoid an invisible bias in what tech users see, the information on which they base their choices and opinions.

If White is correct, Memex is just the beginning of a generation of tools that can help save the Internet from becoming a glorified shopping mall. That’s good. It’s much better than what we have now. But will it be profound? Will it make us better citizens, or more-realized human beings?

White watches me a moment, then almost smiles.

“These are very interesting and very important questions,” he says.

And ones he has only begun to shine his light on.