The journal Nature today released a massive retrospective on the tenth anniversary of the Human Genome Project (officially celebrated June 26 of this year), which included two important pieces from genomics pioneers J. Craig Venter and Francis Collins. While retrospectives generally look backward, Venter and Collins are already looking to the next decade, one filled with free-flowing information, reams of phenotype data and multiple genomes per person. But the biggest development in genomics won’t be a genomics development at all; it will be the arrival of the biggest, baddest computing systems the world has ever seen.
Genome Data will be Free
The race to sequence the first human genome produced some fantastic things, not least of which was the first human genome. Ongoing prizes like the Archon X-Prize continue to offer research groups and academics the incentives to push the technological and scientific envelopes toward greater innovation. But cooperation, not competition, will get genomics to where it needs to be.
Collins notes that while legally binding policies must be in place to ensure individual privacy, genome data must be made available to all. Genomics is simply too big to end up like Big Pharma, with each individual entity clinging tenaciously to its proprietary data sets like commodities. The original Human Genome Project established an ethic of immediate data deposit that allowed others access to its data. That kind of openness and inclination toward collaboration will characterize the future of genome research.
Phenotype is the New Genotype
We’ve figured out how to sequence the genome, but that was only the beginning. Now we’ve got to figure out what it all means, and that means phenotype (behaviors, environmental factors, physical characteristics, etc.) will become just as important as genotype in interpreting the genome. And while phenotype may seem easier to characterize than genotype, the task is actually far larger.
The vast complexity of human clinical and biological data is not easily digitized. As Venter notes, a query like “are you diabetic?” is simple enough to answer with a yes or no, but that one query raises many more: age, diet, medication, family history, vascular health, environment, etc. Only by pulling all that data into one place can we really begin to use the genome to revolutionize medicine. Which means . . .
The Next Big Genomics Breakthrough is Actually a Computing Breakthrough
Say we had all the genotype data and phenotype data we ever wanted. Without a means to process, analyze and cross-reference all that information, we would simply be floating on a sea of base pairs and phenotype data with no practical means of navigation. “The need for such an analysis could be the best justification for building a proposed ‘exascale’ supercomputer, which would run 1,000 times faster than today’s fastest computers,” Venter writes. Such machines could unlock a future in which each person has access not to a single genome, but to several genomes taken from various cell types within his or her body.
Collins agrees, emphasizing that there’s no substitute for good old-fashioned elbow grease: large-scale research projects that tediously log reams of data, together with technological breakthroughs that let us make use of all that information, will be the driving forces behind the next great strides in genetic research.
Graduate research assistants and computer scientists, sharpen your pencils. The future of genetics, it turns out, is in your hands.