John Anderton is the chief of a special police unit in Washington, D.C. This particular morning, he bursts into a suburban house moments before Howard Marks, in a state of frenzied rage, is about to plunge a pair of scissors into the torso of his wife, whom he found in bed with another man. For Anderton, it is just another day preventing capital crimes. “By mandate of the District of Columbia Precrime Division,” he recites, “I’m placing you under arrest for the future murder of Sarah Marks, that was to take place today….”
Other cops start restraining Marks, who screams, “I did not do anything!” The opening scene of the film Minority Report depicts a society in which predictions seem so accurate that the police arrest individuals for crimes before they are committed. People are imprisoned not for what they did, but for what they are foreseen to do, even though they never actually commit the crime. The movie attributes this prescient and preemptive law enforcement to the visions of three clairvoyants, not to data analysis. But the unsettling future Minority Report portrays is one that unchecked big-data analysis threatens to bring about, in which judgments of culpability are based on individualized predictions of future behavior.
Of course, big data is on track to bring countless benefits to society. It will be a cornerstone for improving everything from healthcare to education. We will count on it to address global challenges, be it climate change or poverty. And that is to say nothing of how businesses can tap big data, or of the gains for our economies. The benefits are just as outsized as the datasets. Yet we need to be conscious of the dark side of big data too.
Already we see the seedlings of Minority Report-style predictions penalizing people. Parole boards in more than half of all U.S. states use predictions founded on data analysis as a factor in deciding whether to release somebody from prison or to keep him incarcerated. A growing number of places in the United States — from precincts in Los Angeles to cities like Richmond, Virginia — employ “predictive policing”: using big-data analysis to select what streets, groups, and individuals to subject to extra scrutiny, simply because an algorithm pointed to them as more likely to commit crime.
But it certainly won’t stop there. Such systems point toward using big data for a novel purpose: to prevent crime from happening by predicting, eventually down to the level of individuals, who might commit it.
A research project under the U.S. Department of Homeland Security called FAST (Future Attribute Screening Technology) tries to identify potential terrorists by monitoring individuals’ vital signs, body language, and other physiological patterns. The idea is that surveilling people’s behavior may detect their intent to do harm. In tests, the system was 70 percent accurate, according to the DHS. (What this means is unclear; were research subjects instructed to pretend to be terrorists to see if their “malintent” was spotted?) Though these systems seem embryonic, the point is that law enforcement takes them very seriously.
Stopping a crime from happening sounds like an enticing prospect. Isn’t preventing infractions before they take place far better than penalizing the perpetrators afterwards? Wouldn’t forestalling crimes benefit not just those who might have been victimized by them, but society as a whole?
But it’s a perilous path to take. If through big data we predict who may commit a future crime, we may not be content with simply preventing the crime from happening; we are likely to want to punish the probable perpetrator as well. That is only logical. If we just step in and intervene to stop the illicit act from taking place, the putative perpetrator may try again with impunity. In contrast, by using big data to hold him responsible for his (future) acts, we may deter him and others.
Today’s forecasts of likely behavior — found in things like insurance premiums or credit scores — usually rely on a handful of factors that are based on a mental model of the issue at hand (that is, previous health problems or loan repayment history). Basically, it’s profiling — deciding how to treat individuals based on a characteristic they share with a certain group. With big data we hope to identify specific individuals rather than groups; this liberates us from profiling’s shortcoming of making every predicted suspect a case of guilt by association.
The promise of big data is that we do what we’ve been doing all along — profiling — but make it better, less discriminatory, and more individualized. That sounds acceptable if the aim is simply to prevent unwanted actions. But it becomes very dangerous if we use big-data predictions to decide whether somebody is culpable and ought to be punished for behavior that has not yet happened.
The very idea of penalizing based on propensities is nauseating. To accuse a person of some possible future behavior is to undermine the very foundation of justice: that one must have done something before we can hold him accountable for it. After all, thinking bad things is not illegal; doing them is. It also negates the idea of the presumption of innocence, the principle upon which our legal system, as well as our sense of fairness, is based. And if we hold people responsible for predicted future acts, ones they may never commit, we also deny that humans have a capacity for moral choice.
The important point here is not simply one of policing. The danger is much broader than criminal justice; it covers all areas of society, all instances of human judgment in which big-data predictions are used to decide whether people are culpable for future acts or not. Those include everything from a company’s decision to dismiss an employee, to a doctor denying a patient surgery, to a spouse filing for divorce.
Perhaps with such a system society would be safer or more efficient, but an essential part of what makes us human — our ability to choose the actions we take and be held accountable for them — would be destroyed. Big data would have become a tool to collectivize human choice and abandon free will in our society. And even if a person isn’t thrown into a chic, nightclub-like standing prison as in the film Minority Report, the effect may look like a penalty nonetheless. A teenager visited by a social worker for having the propensity to shoplift will feel stigmatized in the eyes of others — and his own.
In the big-data era we will have to expand our understanding of justice, and require that it include safeguards for human agency as much as we currently protect procedural fairness. Without such safeguards the very idea of justice may be utterly undermined.
By guaranteeing human agency, we ensure that government judgments of our behavior are based on real actions, not simply on big-data analysis. Thus government must only hold us responsible for our past actions, not for statistical predictions of future ones. And when the state judges previous actions, it should be prevented from relying solely on big data. Companies, too, should open their big-data activities to scrutiny when those activities can cause substantial harm to many.
A fundamental pillar of big-data governance must be a guarantee that we will continue to judge people by considering their personal responsibility and their actual behavior, not by “objectively” crunching data to determine whether they’re likely wrongdoers. Only that way will we treat them as human beings: as people who have the freedom to choose their actions and the right to be judged by them.
This article was excerpted with permission from Big Data: A Revolution That Will Transform How We Live, Work, and Think (Houghton Mifflin Harcourt, 2013). Viktor Mayer-Schönberger is a professor of Internet governance and regulation at the Oxford Internet Institute in the UK. Kenneth Cukier is the data editor of The Economist.