In Overmatched, we take a close look at the science and technology at the heart of the defense industry—the world of soldiers and spies.

YOU COULD PLAY Elden Ring, or maybe you’d rather try out God of War Ragnarok. Perhaps you’re more of a Candy Crush connoisseur. But if you want researchers to gather data on your gaming strategy, and what that might mean for real-world nuclear war, you might instead check out SIGNAL, courtesy of the Project on Nuclear Gaming (PoNG). The academic group behind the game wanted to show that video games could be used to gather large-scale data on human behavior and military strategy. Perhaps, they speculate, this digital tool could even hang on the belt along with traditional military research and formal war games.

In a pilot analysis, whose results were published a few months ago in the Journal of Peace Research, scholars analyzed more than 400 SIGNAL matches to see how the existence of tailored nuclear weapons—more on what those are in a moment—affected the likelihood that some world leader (or, in this case, a player pretending to be a world leader) might start an atomic war. 

For the uninitiated, tailored is a term that in this context refers to nuclear bombs that don’t just detonate with many megatons of energy. Tailored nuclear weapons may include boosted electromagnetic pulses—the grid-killing bursts of radiation that blast out right after explosions. They may be neutron bombs, which produce more radiation compared to their blast than other weapons. Or they may be bombs that are smaller and less destructive than their traditional counterparts, a category commonly called tactical nuclear weapons. 

For a while, such tactical weapons were a key part of the US arsenal: nuclear torpedoes, nuclear artillery shells, nuclear land mines. “Name a conventional weapons system today, and there used to be a nuclear weapon to fit that role,” says Geoff Wilson, director of the Center for Defense Information at the Project on Government Oversight.

In the early 1990s, the US largely phased these weapons out, although it currently has a couple hundred on hand. Russia has a couple thousand. Recently, and not unrelatedly, tactical weapons have reentered American military discourse too. In 2018, the Nuclear Posture Review made way for a new low-yield weapon called the W76-2; the Biden administration’s 2022 review kept it on the table.

Although the precise characteristics that make a nuclear weapon tactical, aka nonstrategic, are debatable, University of Southern California international relations professor Nina Rathbun recently wrote that “tactical nuclear weapons vary in yields from fractions of 1 kiloton to about 50 kilotons, compared with strategic nuclear weapons, which have yields that range from about 100 kilotons to over a megaton, though much more powerful warheads were developed during the Cold War.” A kiloton is the amount of energy that 1,000 tons of TNT would release. For the record, both bombs that the US dropped on Japan during World War II would now be considered tactical. Estimates hold that those bombings killed anywhere between 110,000 and 210,000 people.
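As a rough illustration of the ranges Rathbun describes, the short Python sketch below sorts a few yields into her two categories. The function name, the "unclassified" label for the gap between 50 and 100 kilotons, and the approximate yields used for the Hiroshima and Nagasaki bombs (commonly cited as about 15 and 21 kilotons) are assumptions added here for illustration, not figures from her article.

```python
def classify_yield(yield_kt):
    """Sort a yield (in kilotons) into the rough ranges quoted above.

    Assumption: the quoted ranges leave 50-100 kilotons undefined,
    so this sketch labels that gap "unclassified".
    """
    if yield_kt <= 50:
        return "tactical"
    if yield_kt >= 100:
        return "strategic"
    return "unclassified"


# Approximate, commonly cited yields (assumptions, not from the article).
examples = {"Hiroshima": 15, "Nagasaki": 21, "one-megaton warhead": 1000}

for name, yield_kt in examples.items():
    print(name, yield_kt, "kt:", classify_yield(yield_kt))
```

Under these assumptions, both World War II bombs land in the tactical range, which matches the point made above.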

There are many issues related to the existence of tactical nuclear weapons, and here’s one of the biggest: Experts don’t agree on whether they make the world more stable or less stable, or whether they make nuclear war more or less likely. Maybe these bombs provide an eye-for-an-eye deterrent against other countries’ similarly sized weapons, meaning everyone is threatened away from launching any. But maybe these weapons make countries more willing to launch—and thus to break what Brown University’s Nina Tannenwald calls the “nuclear taboo”—because the consequences on the ground are less apocalyptic than what comes with the use of traditional, more powerful nuclear weapons. Wilson, of the Project on Government Oversight, falls into the latter camp.

Most troublingly, though, no one knows if a “limited” nuclear war, fought with comparatively small nuclear weapons, would actually stay limited and little. “Once you decide to let one of these things off the chain somewhere, the threat of using more of them increases,” says Wilson. 

Can data from a game be helpful?

There isn’t actually ground-truth data to support theories on how any kind of tailored nuclear weapon affects the course of war, because only one country has ever used nuclear weapons in war, and it did so back when no other nation had any. The physical data set has a sample size of one. “We certainly don’t want to have any kind of [real world] experimental data around the nuclear use,” says Bethany Goldblum, a co-author on the recent SIGNAL results paper who currently holds positions at Berkeley and Lawrence Berkeley National Lab. 

In the absence of such evidence, Goldblum and collaborators hoped that an online war game might provide significant fictional data—enough that it could be analyzed statistically. With a sample size of more than 400 games, they succeeded at that part.

War games in general are a longstanding means used by defense wonks and military leaders to figure out what other humans, beyond their borders, might do, and what they themselves might do in response or preemptively. “People role-playing together in a room is a type of war game that is often referred to as a tabletop exercise,” says Goldblum. It’s a common practice in think tanks and in government. “There are also war games in the form of strategic board games,” she continues. Some studies are surveys, which aren’t quite games but which present people with various written scenarios, to which people respond with what they would do. Usually, participants in these kinds of efforts are experts or practitioners in the relevant field. Digital simulations also exist to explore decision-making in different scenarios.

PoNG scientists, though, wanted their offering to be a little different: to live inside computers, be larger-scale, involve a wider and larger swath of people, immerse players in an environment where they have to live with their choices, and allow for iteration and experimentation. (PoNG comes from the University of California, Berkeley; the Nuclear Science and Security Consortium; and Lawrence Livermore and Sandia national laboratories, and the analysis in the Journal of Peace Research came from two of its members.)

So they came up with the “Strategic Interaction Game between Nuclear Armed Lands,” or SIGNAL, which was designed to investigate what Andrew Reddie, a cybersecurity professor at Berkeley and co-author on the results paper, calls their “toy problem”: Does adding tailored weapons to the arsenal increase the likelihood of nuclear use?

To test it out, the project team gathered players through social media, mailing lists, meetups, Amazon’s Mechanical Turk, and campus events, and also through the chance interest of internet passersby. 

Shall we play a game?

SIGNAL went live in May 2019, and it’s still up today. You can play if you convince two friends to log in at the same time as you, learn the somewhat complicated rules, and then stick it out till you nuke each other or don’t. Players are welcomed to a digital board filled with hexagonal tiles arranged in the shapes of three (fictional) countries, each of which is delineated by the color of its tiles: purple, green, or orange. The research team chose the unrecognizable national borders and non-triggering colors (no red, for instance) to decrease people’s tendency to read real-world situations into this fictional universe. “Minor states” are neutral in gray and can become allies.

A swelling soundtrack accompanies the game’s loading. Before you can make any moves, you first have to signal that you are about to do something by putting a generic marker on a hex, thus telling everyone that you might act on that piece of territory. If it’s a hex that belongs to another playing country, the two of you can negotiate in the chat box. Then, you can act—or not—on that tile: conventionally striking it with infantry or a missile, cyber attacking it, or navally attacking it. You may defend your own territory, build cities or military bases, or, of course, go nuclear. 

In SIGNAL, the scientists created a setup where two players were nuclear-armed countries and one was conventionally armed. But that setup had two different varieties: In one, which accounted for 209 of the matches in their analyzed dataset, the nuclear-armed countries had only traditional nuclear weapons. In the other, which represented 216 games, the nuclear nations had both traditional and tailored nuclear weapons. It is, the authors boasted, “the largest wargaming dataset collected to date,” at least “to the best of [their] knowledge.”

Players win by increasing their infrastructure and resources and defending their territory—pretty standard for strategy games like Civilization. The question the researchers had in mind related to who uses nuclear firepower to help them with those objectives.

It sounds robust, but there are some problems with SIGNAL. “Our game is hard to orient to,” admits Goldblum. “It’s a complex environment, by design, because you have to add enough complexity in order for it to be realistic.” And the graphics are… not stunning. They resemble, admits Reddie, a year-2000 version of Civilization.

In addition, players can’t just log on and jump in: They have to have three people handy, or the game will just wait for others to join. 

In fact, reasonable skepticism about SIGNAL’s utility and biases, standard for a group of scientists, led the team to also create a survey-based war game, mimicking situations found in the video game, so they could compare people’s behavior in the two.

The results? The presence of tailored nuclear weapons does indeed seem to increase the likelihood that a player will take the conflict radioactive. The results also indicated that if tactical weapons are available, people are more likely to use them than the more destructive traditional ones.

The aftermath 

Those seem like neat conclusions, but that’s not the whole story: Despite the large amount of data gathered from the SIGNAL video game, the results from looking at the game-level trends weren’t actually statistically significant. They just trended toward supporting the survey’s findings—those listed above. Considering the game alone, though, the presence of tailored weapons only increased the likelihood of nuclear conflict by 2 percent with a margin of error of “plus or minus 20 percent, so we really can’t say much and need more data to reduce this error bar,” notes Goldblum. The effect was more pronounced, though still not with statistical significance, when the analysis removed the final round of play, in which players may have thrown their weapons “without fear of reciprocal action,” according to the paper. 
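To see why a 2 percent effect with a 20 percent margin of error says so little, it helps to write out the range it implies. The Python sketch below is a back-of-the-envelope illustration, not the paper's analysis: the function names are hypothetical, and the estimate of how many games would be needed assumes the margin of error shrinks with the square root of the number of games, a common rule of thumb that the researchers do not state.

```python
import math


def interval(effect_pct, margin_pct):
    """Return the (low, high) range implied by an effect and its margin."""
    return (effect_pct - margin_pct, effect_pct + margin_pct)


def excludes_zero(low, high):
    """A range that contains zero cannot rule out 'no effect at all'."""
    return low > 0 or high < 0


def games_needed(current_games, current_margin_pct, target_margin_pct):
    """Rough estimate, ASSUMING margin of error scales as 1 / sqrt(games)."""
    scale = (current_margin_pct / target_margin_pct) ** 2
    return math.ceil(current_games * scale)


low, high = interval(2, 20)            # figures quoted by Goldblum
print(low, high)                       # -18 22
print(excludes_zero(low, high))        # False: zero sits inside the range

# 209 + 216 analyzed matches; target margin set equal to the 2-point effect.
print(games_needed(209 + 216, 20, 2))  # 42500
```

Under that scaling assumption, shrinking the error bar to the size of the effect itself would take on the order of 100 times as many matches, which gives a sense of what "need more data" could mean in practice.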

Demographic findings—about female players, college graduates, people over 29, people with national-security expertise or jobs—also didn’t rise to the level of statistical significance in a game-level analysis, though the potential trends warrant further study, in Goldblum’s view. 

That’s kind of disappointing for a game meant, in part, to get a big sample number. But the specific results of this game set’s conditions weren’t the point, says Reddie: PoNG’s creators wanted to prove that experimental war-gaming could be a thing, and he hopes to make that thing simpler for future researchers. “My primary interest is supporting the creation of a sandbox toolkit to actually make it easier to deploy this stuff in the absence of a million dollars of funding,” says Reddie. 

Goldblum sees it in a similar way. “The biggest takeaway is that experimental war-gaming offers this new tool for study,” she says. And, she adds, the fact that the results from the two different methods didn’t match up precisely provides a note of caution for other researchers: The tool you’re using likely has its own biases that influence players’ behavior.

Some tend to see this particular new tool as useful. Others, like a set of scholars from the RAND Corporation who wrote a letter to Science after the magazine published a 2018 piece about PoNG’s plans, definitely do not agree. The RAND team argued in part that data sets gathered from the public weren’t useful: To understand behavior in international conflict, you need players who are experts in geopolitics. “I’m not necessarily sure that’s wrong, right?” says Reddie. “But it’s a testable theory. They don’t have any data to suggest that they’re right.” 

They could gather some, though, if they compared such experts’ SIGNAL gameplay to that of a group of non-experts. In a lot of ways, though, all those human unpredictabilities—the possible dependence on experience, individual difference, inability to get behaviors to cohere or coalesce, strategy alterations based on whether an interaction is occurring on-screen or with a sheet of paper—are also part of the point. “Nuclear decisions would likely be made and influenced by fallible individuals acting under a tremendous amount of stress and time pressure,” says Wilson. In fact, he thinks war games are mostly useful for the pesky personhood that plagues them all. 

“The value of war games is, I think,” says Wilson, “that they show how unpredictable and often wrong our assumptions about humans are.”
