Inside the Mysterious Dark Matter of the Human Genome

When scientists sequenced the human genome a decade ago, it was somewhat like looking at a blueprint in a foreign language — everything was marked in its proper location, but no one could tell what it all meant. Only about 1 percent of our genome codes for proteins that actually do anything, so the rest of our DNA has been like biology’s dark matter, acting in mysterious ways. Now, after years of monumental effort, scientists think they have some answers.

A five-year project called ENCODE, for “Encyclopedia of DNA Elements,” found that about 80 percent of the human genome is biologically active, influencing how nearby genes are expressed and in which types of cells. This DNA is not junk, as was previously thought. Instead, these non-coding regions could have a major bearing on diseases and genetic mutations, researchers say.

The project will rewrite the textbooks, turning the architectural blueprint of the human genome into a control schematic and instruction manual that explains how genes turn on and off. These rules dictate everything from embryonic development to the process of aging.

MIT computer science professor Manolis Kellis is one of hundreds of scientists participating in the ENCODE project, and he explained that the project lays bare the nucleotide differences that make us individuals.

“What ENCODE allows you to do is provide an annotation of what each nucleotide of the genome does, so that when it’s mutated, we can make some predictions about the consequences of the mutation,” he told MIT News.

ENCODE annotates all of the 3.2 billion A, C, G and T nucleotides that make up genes and their regulatory sections. It turned up some interesting fossil sections, DNA relics of our evolutionary history, and suggests that some of these “pseudogenes” are not dormant after all but are still active as non-coding RNAs. Over the past five years, more than 400 researchers conducted upwards of 1,600 experiments on 150 types of human tissue to untangle all this activity. If ENCODE were presented in graphical form, the data it has generated so far would fill a poster 30 kilometers long and 16 meters high, according to the Associated Press.

The ENCODE project’s findings appear in more than 30 scientific papers this week, in Nature, Science and other journals.

The most striking aspect of the study appears to be the surprisingly powerful role of gene regulation. Variants in our DNA that can cause disease are not always located in the genes themselves, which is notable from a gene therapy standpoint. Rather, they are often found in the stretches of DNA that regulate those genes. Sometimes these regulatory stretches are close to the genes they regulate, but sometimes they are very far away, at least from a linear perspective.

There’s a wealth of information and insight to be gained from these results, which you can explore with Nature’s interactive explorer. The human genome was just the beginning.

[AP, ScienceDaily]