The $1,000 Genome, and the New Problem of Having Too Much Information

The next sequence is even cheaper

Scientists needed $3 billion and 13 years to sequence the three billion base pairs encoded in a single human genome—the first time. By 2011, eight years after that first project was completed, the cost of sequencing a human genome had fallen to $5,000, in a process that took just a few weeks. And in January, Jonathan Rothberg, a chemical engineer and the founder of the biotech company Ion Torrent, unveiled an approach that is faster and cheaper still. He says his machine will be able to sequence a human genome, some 3.2 gigabytes’ worth of data, in two hours for just $1,000. Now thousands, and soon enough millions, of patients will have their genetic makeup laid bare, which presents an entirely new problem: How to analyze all that information?

Rothberg had introduced the first sequencing machine that could perform millions of chemical reactions on a fiber-optic array in 2004. But in 2010, he replaced the fiber-optic array with a semiconductor chip. In a powerful application of Moore’s law, which states that the number of transistors on a microchip doubles about every two years, the number of arrays Rothberg has been able to fit on his chips has grown rapidly. The more arrays he can squeeze onto a chip, the greater its performance and the cheaper the cost of sequencing. As if to prove the point, Rothberg sequenced Intel co-founder Gordon Moore’s genome last year on a silicon semiconductor chip with 1.2 million microwells. His new machine’s first chip, the Ion Proton I, has 165 million microwells. And Rothberg says he will release the Proton II, a chip with four times as many wells, later this year. The Proton II will make two-hour, full-genome sequencing possible.

Despite the falling cost of sequencing, personalized genomic medicine has thus far been used very selectively. A sequence from a single patient often isn’t enough to pinpoint the genes responsible for a disease, so even relatively cheap sequencing can quickly become prohibitively expensive. “If you knew you could find the answer in one patient, $1,000 versus $5,000 might not be a deciding factor,” says Richard Lifton, the chairman of the department of genetics at Yale School of Medicine. “But if you think you might need to study 20 patients with similar diseases, then you’re talking about $20,000 versus $100,000.” Lifton says he will use four of Rothberg’s Proton machines to help locate the genetic abnormalities that cause mysterious diseases in his patients.
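Lifton's arithmetic is easy to check. The short Python sketch below multiplies out his 20-patient example at the two per-genome prices quoted above; the function name `cohort_cost` is a hypothetical helper invented for this illustration, and the figures count sequencing only, not analysis or storage.

```python
def cohort_cost(patients: int, price_per_genome: int) -> int:
    """Total sequencing cost, in dollars, for a group of patients."""
    return patients * price_per_genome


PATIENTS = 20  # cohort size from Lifton's example

old_total = cohort_cost(PATIENTS, 5_000)  # at the 2011 price per genome
new_total = cohort_cost(PATIENTS, 1_000)  # at the promised $1,000 price
savings = old_total - new_total

print(f"At $5,000 per genome: ${old_total:,}")
print(f"At $1,000 per genome: ${new_total:,}")
print(f"Difference:           ${savings:,}")
```

The per-genome gap of $4,000 looks modest for one patient, but it scales linearly with the size of the study, which is why the cheaper price matters most for multi-patient investigations.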

Richard Gibbs, who runs the Human Genome Sequencing Center at Baylor College of Medicine, says he will use an Ion Torrent sequencer to investigate the genetic basis of Mendelian disorders, diseases caused by single-gene mutations, which afflict 25 million Americans. Last year, Gibbs was part of a team that sequenced the complete genomes of Noah and Alexis Beery, 14-year-old twins who were diagnosed when they were five with a rare movement disorder caused by a defect in how their bodies process dopamine. Alexis had trouble breathing because of spasms in her larynx. By examining the twins’ genes and comparing them with those of their older brother, parents and grandparents, the team found that the siblings were also deficient in serotonin, allowing doctors to adjust their medication and normalize Alexis’s breathing.

And sequencing will only get cheaper. At IBM, researchers are at work on a $100 sequencer, a chip that could read bases as DNA fragments flow through nanometer-wide holes on its surface. When genome sequencing begins reaching millions of patients, it will help address the most common problems in medicine. St. Jude Children’s Research Hospital in Memphis, Tennessee, is now sequencing the DNA in cancer cells in pediatric patients to identify the gene mutations that lead to childhood cancers. Jay Shendure, a professor of genetics at the University of Washington, says that in 10 years sequencing will be routine. One’s genome could appear alongside other standard medical information, like blood pressure.

The day when a genome is seamlessly incorporated into everyone’s medical information will not arrive as quickly as $1,000 sequencing. After all, medicine isn’t governed by Moore’s law. Soon the price of sequencing will fall below the price of storing the data it generates. Two companies, GenomeQuest and DNANexus, now host genomes on the cloud for scientists and doctors to access. Doctors will need to be trained to apply genomic information to standard medical practice; the National Human Genome Research Institute has awarded more than $80 million for this purpose. “It’s not a system that moves very quickly,” Shendure says, “but it will happen.”

Decoding the Double Helix

The Y-axis is the number of incorporated base pairs per well. The X-axis is the “flow” or well number, with corresponding DNA base pairs.


To determine the order of nucleotide bases in a genome—the As, Gs, Cs and Ts in our DNA—scientists attach single strands of DNA fragments to the surface of micron-wide beads. The beads are centrifuged into microwells on the surface of an Ion Proton chip. Technicians place the chip inside a machine, where it is flooded with one of the four nucleotides at a time. The machine looks for nucleotide matchups, building a complementary strand of the patient’s double helix.

When a matchup occurs, a positively charged hydrogen ion is released. A metal sensor under each well registers the increased charge, and transistors beneath the well convert the charge into a voltage. Software determines which base was incorporated and plots the result in a chart [above]; the fragments are then reassembled into a whole genome.
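The flow-by-flow process described above can be sketched in a few lines of Python. This is a simplified illustration, not Ion Torrent's actual software: the flow order `TACG`, the function names and the example fragment are all assumptions made for the sketch, and a real chip reports noisy voltages rather than clean integer counts.

```python
# Watson-Crick pairing: each base on the template pairs with one partner.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed order in which nucleotides are washed over the chip.
FLOW_ORDER = "TACG"


def simulate_flows(template, flow_order=FLOW_ORDER):
    """Return one (nucleotide, bases_incorporated) pair per flow for one well."""
    signals = []
    pos = 0   # next unread position on the template strand
    flow = 0  # how many flows have been applied so far
    while pos < len(template):
        nuc = flow_order[flow % len(flow_order)]
        count = 0
        # A run of identical bases is copied within a single flow,
        # releasing one hydrogen ion per base incorporated.
        while pos < len(template) and COMPLEMENT[template[pos]] == nuc:
            count += 1
            pos += 1
        signals.append((nuc, count))
        flow += 1
    return signals


def call_bases(signals):
    """Rebuild the complementary strand from the per-flow counts."""
    return "".join(nuc * count for nuc, count in signals)


fragment = "ATTGC"  # made-up template fragment
signals = simulate_flows(fragment)
built = call_bases(signals)
print(signals)  # per-flow counts, like the bars in the chart
print(built)    # complementary strand built in the well
```

Each pair in `signals` corresponds to one bar in a chart like the one above: the flow on the X-axis and the number of bases incorporated on the Y-axis. A flow that finds no match simply records zero.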
