Mammoth Project to Digitize the Tree of Life Could Uncover Thousands of New Species

Microsoft's Photosynth software will help scan and catalog 3-D models of specimens for analysis over the Web, anywhere
Entomology specimens arranged within one aisle of the Entomology Department compactor collection cabinets, in a display designed to illustrate the size and scope of the collection. Chip Clark/National Museum of Natural History, Smithsonian Institution

Over the past 20 years, Richard Pyle figures he’s discovered 100 new species of fish. But he’s identified only one fifth of them. Pyle, an ichthyologist at Bishop Museum in Hawaii, isn’t a slacker—he spent hundreds of hours tracking down those fish. It’s just that proving that a new species is unique can be as tough as finding it in the first place.

“There are literally thousands of new species sitting in storage at museums,” says Quentin Wheeler, a renowned taxonomist and founder of the International Institute for Species Exploration at Arizona State University. Identifying these species does more than satisfy curiosity; conservation efforts can be organized well only with a clear understanding of which species exist. Here’s the snag: To verify a new species, a taxonomist must examine the type specimens (the preserved representative samples of every known plant and animal) of each similar existing species to document the differences. Some animals must be compared with hundreds of specimens, and those might be anywhere in the world. Many are too fragile to ship. But Wheeler thinks he has a solution that could dramatically speed up the classification and identification of species.

This month, Wheeler and his collaborators at Microsoft and the Woods Hole Oceanographic Institution in Massachusetts applied for National Science Foundation funding to create digital type specimens, or “e-types,” for the more than one million known insects. To generate an e-type, a curator will simply slide the bug into a custom-built scanner that will take 100 or so 20-megapixel images and stitch them into a three-dimensional model using Microsoft’s Photosynth software. With a dozen of these machines working through the archives of the world’s natural-history museums, Wheeler expects that the effort could digitize all the insects in five years and, with additional funding and projects, the rest of the world’s 1.8 million known species within a decade.
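
As a rough back-of-envelope check, the sketch below works out what those figures imply for each scanner. It assumes exactly one million specimens, 12 machines, 100 images per specimen, and scanning on all 365 days of the year; the function name and the working-days figure are illustrative assumptions, not details of the proposal.

```python
def scanner_workload(specimens, machines, years,
                     images_per_specimen, megapixels_per_image,
                     days_per_year=365):
    """Return (specimens per machine per day, gigapixels of raw imagery per specimen)."""
    per_machine_per_year = specimens / machines / years
    per_machine_per_day = per_machine_per_year / days_per_year
    # 1 gigapixel = 1,000 megapixels
    gigapixels_per_specimen = images_per_specimen * megapixels_per_image / 1000
    return per_machine_per_day, gigapixels_per_specimen


# Figures from the article; days_per_year is an assumption.
per_day, gigapixels = scanner_workload(1_000_000, 12, 5, 100, 20)
print(f"{per_day:.1f} specimens per machine per day")
print(f"{gigapixels:.1f} gigapixels of raw imagery per specimen")
```

Under these assumptions each machine would need to handle roughly 46 insects a day, every day, for five years, with each e-type built from about 2 gigapixels of raw images.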

Every year, taxonomists introduce about 20,000 species. In the same period, an estimated 30,000 vanish. At that rate, the majority of Earth’s inhabitants could disappear before they’re known to science. Simply by making it easier to examine existing organisms, Wheeler says his project will help bump the species description rate to 200,000 a year—a clip that could make it possible to name the world’s commonly estimated 20 million unknown species within 50 years.
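
The arithmetic behind these rates is easy to check. The sketch below assumes a constant description rate and a fixed backlog of 20 million unknown species, ignoring extinctions along the way; both are simplifying assumptions, and the function name is illustrative.

```python
def years_to_describe(unknown_species, species_per_year):
    """Years needed to name a fixed backlog at a constant yearly rate."""
    return unknown_species / species_per_year


UNKNOWN = 20_000_000  # commonly estimated unknown species, per the article

current = years_to_describe(UNKNOWN, 20_000)    # today's rate
boosted = years_to_describe(UNKNOWN, 200_000)   # projected rate with e-types
required_rate = UNKNOWN / 50                    # rate needed to finish in 50 years

print(f"At 20,000 a year:  {current:,.0f} years")
print(f"At 200,000 a year: {boosted:,.0f} years")
print(f"Rate needed for 50 years: {required_rate:,.0f} a year")
```

At a flat 200,000 a year the backlog would take about 100 years, so the 50-year figure presumably depends on the rate continuing to climb beyond that level, or on a smaller count of unknown species.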

Some scientists, however, think even e-types won’t get the job done fast enough. Genetic testing, they argue, is the key. Leading this movement is Canadian evolutionary biologist Paul Hebert, who wants to assign every species a DNA “bar code” based on variations in a gene that nearly all organisms possess. A new species would be identified not by its physical divergence from existing type specimens but by the difference between its bar code and all the others. Because the system is fast and cheap—the technique requires basic laboratory skills and can process 95 samples in two hours for $10 apiece—Hebert says it could uncover every unknown plant and animal by 2025.

But bar-coding is fast and cheap because it uses just one piece of one gene—and that makes it prone to errors. In 2008, Pyle made a definitive distinction between two new fish species, even though their bar codes were essentially identical. “It’s a blunt instrument,” Wheeler says of the DNA system. “One gene just isn’t sufficient.” Looking at actual specimens is an art, he says. Plus, it’s fun. Although he admits that someday it might be possible to define new species using only DNA, he doesn’t like that idea much: “That would make taxonomy too boring to be worth doing.”