Why it’s so hard to make quick, accurate estimates on new COVID variants

A national sequencing bias likely led to an overestimate of the true prevalence of Omicron.
Because Omicron spreads so quickly, small changes in our understanding of its initial prevalence can lead to hugely different model outcomes. Bihlmayerfotografie/Deposit Photos

At the end of December, the CDC sharply revised its estimates of the prevalence of Omicron relative to other COVID variants across the US. Before the revision, it reported that Omicron made up 73 percent of new COVID cases from December 12 to 18. Now, it puts the prevalence for that week at 23 percent, and estimates that Omicron made up 59 percent of new cases between December 19 and 25.

As Popular Science reported recently, that lower estimate makes it even harder to understand the US’s outbreak—Delta is clearly still a dominant force, and is likely responsible for most deaths.

Why such a dramatic swing? The CDC can’t actually see every COVID case, so it relies on a modeling approach called a nowcast, a blend of “now” and “forecast.” That model predicts spread based on the characteristics of COVID variants and the general population, and tethers that prediction to real-world data.

Right now, there’s still considerable uncertainty over how quickly Omicron spreads relative to Delta, and how easily it’s able to infect people with prior immunity. “The mathematical models that describe COVID transmission dynamics are becoming more and more complex,” says Anass Bouchnita, an infectious disease modeler with the University of Texas at Austin’s COVID-19 Modeling Consortium. “We have several new effects”—like booster efficacy and prior immunity—“that need to be incorporated, and we have to do this very rapidly, given how quickly the landscape of COVID is changing.”

Bouchnita and colleagues produced a model of Omicron’s spread in mid-December, based on the data that was then available about the variant’s properties. “In the most pessimistic scenario we were expecting that the peak would reach about 500,000 cases per day,” he says. Since then, it’s become more likely that the Omicron surge will come on faster and sharper. “When we revise the projections, we expect that cases will reach 700,000 per day,” sometime in January.

But quirks in the underlying data also led to the change. Because Omicron spreads so quickly, small changes in our understanding of its initial prevalence can lead to hugely different model outcomes. To a certain extent, those data quirks are probably something pretty basic: The US was looking for Omicron cases, and so it found lots of them. “One of the reasons why the CDC adjusted their prevalence numbers is because so many labs specifically looked for [Omicron] samples,” says Krista Queen, who oversees COVID sequencing at Louisiana State University Health Shreveport. “When you go looking for that [variant], it’s going to inflate the prevalence.”
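To see why the starting point matters so much, here is a minimal sketch, not the CDC's nowcast or the UT Austin model. It assumes a simple logistic model in which a new variant's odds of being the cause of a case grow exponentially at a fixed daily rate. The function names, the 0.25-per-day growth advantage, and the 1 percent and 3 percent starting shares are all hypothetical values chosen for illustration.

```python
import math


def variant_share(initial_share, daily_advantage, days):
    """Share of new cases caused by the variant after `days`.

    Assumes (as a simplification) that the variant's odds grow
    exponentially at a constant rate `daily_advantage` per day.
    """
    odds = initial_share / (1.0 - initial_share) * math.exp(daily_advantage * days)
    return odds / (1.0 + odds)


def days_to_majority(initial_share, daily_advantage):
    """Days until the variant makes up half of new cases in this model."""
    return math.log((1.0 - initial_share) / initial_share) / daily_advantage


# Hypothetical inputs: the same growth advantage, two guesses at the
# starting share that differ by only two percentage points.
ADVANTAGE = 0.25
for start in (0.01, 0.03):
    share_two_weeks = variant_share(start, ADVANTAGE, 14)
    crossover = days_to_majority(start, ADVANTAGE)
    print(f"start={start:.0%}  share after 14 days={share_two_weeks:.0%}  "
          f"days to majority={crossover:.1f}")
```

Under these assumptions, the two starting guesses produce noticeably different shares two weeks out and shift the crossover date by several days, which is the kind of sensitivity the modelers describe.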

Omicron is relatively easy to spot with a PCR test, even without sequencing. PCR tests look for a handful of specific sequences from the virus’s genome. And it just so happens that some of them are designed to look for a sequence that, on Omicron, has mutated into an unrecognizable form. When such tests are used on a case of Omicron, they return a partial positive: three yeses for the intact sequences, and one no, called an “S gene target failure.” That signature isn’t a slam dunk, since it appears in a few other strains. But it gives researchers a clue.

Those obvious PCR results are a large part of how the US’s genomic surveillance system confirmed the presence of Omicron quickly. Health departments across the country looked for rare cases with S gene target failure, and sequenced them. “Early in December,” says Queen, “the Louisiana Department of Health,” like health departments across the country, “was prioritizing samples that had S-gene target failure.”

Queen’s team currently sequences every sample that comes in from LSU’s testing lab, which works with healthcare facilities and community testing sites. That unselective approach gives them a clearer picture of Omicron’s prevalence. Nationally, though, the bias toward sequencing suspected Omicron samples likely led to an overestimate of the variant’s true prevalence.
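The arithmetic behind that overestimate can be sketched in a few lines. This is an illustration under loud assumptions, not the CDC's correction method: it pretends that S gene target failure flags Omicron perfectly (the signature also appears in a few other strains), and the sample counts and sequencing rates are hypothetical.

```python
def naive_share(true_share, seq_rate_flagged, seq_rate_other):
    """Variant share among sequenced samples when flagged samples
    (S gene target failure) are sequenced at a different rate."""
    flagged = true_share * seq_rate_flagged
    other = (1.0 - true_share) * seq_rate_other
    return flagged / (flagged + other)


def reweighted_share(n_flagged_seq, n_other_seq, seq_rate_flagged, seq_rate_other):
    """Undo the selection by dividing each sequenced count by the
    rate at which that kind of sample was chosen for sequencing."""
    est_flagged = n_flagged_seq / seq_rate_flagged
    est_other = n_other_seq / seq_rate_other
    return est_flagged / (est_flagged + est_other)


# Hypothetical lab: 1,000 positive samples, 200 of them Omicron (20%).
# It sequences 90% of flagged samples but only 10% of the rest.
TRUE_SHARE, RATE_FLAGGED, RATE_OTHER = 0.20, 0.90, 0.10
sequenced_flagged = 200 * RATE_FLAGGED   # 180 samples
sequenced_other = 800 * RATE_OTHER       # 80 samples

print(f"naive estimate:      {naive_share(TRUE_SHARE, RATE_FLAGGED, RATE_OTHER):.0%}")
print(f"reweighted estimate: "
      f"{reweighted_share(sequenced_flagged, sequenced_other, RATE_FLAGGED, RATE_OTHER):.0%}")
```

In this toy case the raw sequencing results suggest a share more than three times the true one, and the reweighting only recovers the right answer because the selection rates are known. Real surveillance data rarely comes with that information, which is part of why a lab that sequences everything gets a cleaner picture.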

Modelers also need to account for testing bias. Vaccinated people are less likely to get COVID at all, but are more likely to catch Omicron, with its immune evasion properties, than Delta. And some data suggests that vaccinated people are also more likely to seek testing. “It’s only people who go to get tested that we’re sequencing,” says Queen. “So are we seeing more Omicron because it’s actually more prevalent? Or because of who’s running out to be tested?”

But focusing too much on Omicron’s prevalence nationally can be misleading: The pandemic looks very different in different places, with deadly Delta outbreaks raging in the upper Midwest, as Omicron surges in highly vaccinated cities like Seattle.

More important is understanding the prevalence of Omicron at a local level. That’s partly because the variant behaves somewhat differently, spreading faster and more easily among the vaccinated, which calls for different safety precautions. But, Queen points out, it is also critical for allocating COVID drugs. “Demand right now for any therapy is through the roof,” she says. Some antibody treatments for COVID patients don’t work as well for Omicron, so regions with the new variant need different medicines. “Are states able to meet the demand? That’s what matters much more than a gross overall estimate for the US.”