Billions of billions of billions of maybe-helpful molecular compounds remain to be discovered and put to use against a whole host of medical problems. But who has the time to conjure a novemdecillion drugs? (That number is a 1 with 60 zeroes after it.) Nobody has the tools or the treasure to do that. To help narrow things down, scientists at Duke University and the University of Pittsburgh created a virtual library that maps out every kind of compound that could exist. The sections are all marked out--now chemists can get to work filling them in.
The small molecule universe, or SMU as they call it, is the set of all feasible organic molecules below a certain weight. Small molecules can cross cell membranes or bind to targets on cells, while larger molecules, those above about 500 Daltons, are too big to be as effective. Chemists led by Duke's David Beratan built a representative library--in some ways more like an encyclopedia--whose entries stand in for all those feasible molecular compounds.
They made it using a new piece of software they called Algorithm for Chemical Space Exploration with Stochastic Search (ACSESS), which uses random sampling to search the unknown. It makes random changes to known molecules, maybe adding a nitrogen here or a carbon there. Synthetic chemists checked that the new combinations made sense, and the team used that information to further train their algorithm. Ultimately, by cataloging like with like, the team came up with 9 million examples that represent all the regions of the small-molecule universe. You can think of it like an encyclopedia of molecules grouped by type: Each example could stand in for billions of related compounds.
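The mutate-filter-keep loop described above can be sketched in a few lines of Python. This is a toy illustration only, not the ACSESS code: molecules are modeled as plain chains of atom symbols, the feasibility check is just the 500-Dalton weight limit (the real project also relied on review by synthetic chemists), and the names `mutate`, `is_feasible`, `distance`, `pick_diverse` and `explore` are hypothetical, as is the simple "farthest-first" rule used here to keep the library diverse.

```python
import random
from collections import Counter

# Approximate atomic masses in Daltons; the atom set is an assumption for this toy.
ATOM_MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}
MAX_WEIGHT = 500.0  # small-molecule cutoff mentioned in the article


def weight(mol):
    """Total mass of a toy molecule (a tuple of atom symbols)."""
    return sum(ATOM_MASS[atom] for atom in mol)


def mutate(mol, rng):
    """Make one random change: add, remove, or swap a single atom."""
    atoms = list(mol)
    op = rng.choice(["add", "remove", "swap"])
    pos = rng.randrange(len(atoms))
    if op == "add":
        atoms.insert(pos, rng.choice(list(ATOM_MASS)))
    elif op == "remove" and len(atoms) > 1:
        del atoms[pos]
    else:  # swap (also the fallback when a one-atom molecule cannot shrink)
        atoms[pos] = rng.choice(list(ATOM_MASS))
    return tuple(atoms)


def is_feasible(mol):
    """Stand-in feasibility test: weight only (an assumption, not real chemistry)."""
    return weight(mol) <= MAX_WEIGHT


def distance(a, b):
    """Crude dissimilarity: difference in atom counts between two molecules."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[x] - cb[x]) for x in ATOM_MASS)


def pick_diverse(candidates, k):
    """Greedy farthest-first selection of up to k mutually dissimilar molecules."""
    candidates = sorted(set(candidates))  # dedupe, fixed order for reproducibility
    chosen = [candidates[0]]
    while len(chosen) < min(k, len(candidates)):
        remaining = (c for c in candidates if c not in chosen)
        chosen.append(max(remaining, key=lambda c: min(distance(c, s) for s in chosen)))
    return chosen


def explore(seeds, generations=20, library_size=10, seed=0):
    """Repeatedly mutate the library, drop infeasible results, keep a diverse subset.

    Assumes every seed is a non-empty, feasible molecule.
    """
    rng = random.Random(seed)
    library = list(seeds)
    for _ in range(generations):
        children = [mutate(m, rng) for m in library for _ in range(5)]
        pool = [m for m in library + children if is_feasible(m)]
        library = pick_diverse(pool, library_size)
    return library


if __name__ == "__main__":
    for mol in explore([("C", "C", "O")]):
        print("".join(mol), round(weight(mol), 1))
```

Starting from a single three-atom seed, each pass proposes random variants, discards any over the weight limit, and keeps only the candidates that differ most from one another, which is the "cataloging like with like" idea in miniature: each survivor represents a neighborhood of similar molecules rather than a single compound.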
In this stochastic voyage, the researchers found some interesting things. First, there are apparently large gaps in the existing compound collections, which result both from nature's proclivity for patterns and from the habits of human builders. Nature uses any available building blocks to create new compounds, while humans in the lab only have a few ingredients to work with. Second, the team found vast regions of emptiness, a kind of small-molecule dark matter, where countless new compounds may fit in like unknown puzzle pieces.
The library is helpful because chemists can use it like a treasure map--they may not know what they'll find, but the map provides some pathways to get there. "It facilitates the mining of chemical libraries that do not yet exist, providing a near-infinite source of diverse novel compounds," the authors explain. The source code for this algorithm is available to other researchers, and the paper has been accepted for publication in the Journal of the American Chemical Society.