Drug developers may have a new tool to search for more effective medications and new materials.
It's a computer algorithm that can model and catalogue the entire set of lightweight, carbon-containing molecules that chemists could feasibly create in a lab.
The small-molecule universe has more than 10^60 (that's 1 with 60 zeroes after it) chemical structures. Duke chemist David Beratan said that many of the world's problems have molecular solutions in this chemical space, whether it's a cure for disease or a new material to capture sunlight.
But, he said, "The small-molecule universe is astronomical in size. When we search it for new molecular solutions, we are lost. We don't know which way to look."
To give synthetic chemists better directions in their molecular search, Beratan and his colleagues -- Duke chemist Weitao Yang, postdoctoral associates Aaron Virshup and Julia Contreras-Garcia, and University of Pittsburgh chemist Peter Wipf -- designed a new computer algorithm to map the small-molecule universe.
The map, developed with a National Institutes of Health P50 Center grant, tells scientists where the unexplored regions of the chemical space are and how to build structures to get there. A paper describing the algorithm and map appeared online in April in the Journal of the American Chemical Society.
The map helps chemists because they do not yet have the tools, time or money to synthesize all 10^60 compounds in the small-molecule universe. Synthetic chemists can only make a few hundred or a few thousand molecules at a time, so they have to carefully choose which compounds to build, Beratan said.
Scientists already have a digital library describing about a billion molecules found in the small-molecule universe, and chemists have synthesized about 100 million compounds over the course of human history, Beratan said. But these molecules are similar in structure and come from the same regions of the small-molecule universe.
It's the unexplored regions that could hold molecular solutions to some of the world's most vexing challenges, Beratan said.
To add diversity and explore new regions of the chemical space, Aaron Virshup developed a computer algorithm that built a virtual library of 9 million molecules with compounds representing every region of the small-molecule universe.
"The idea was to start with a simple molecule and make random changes, so you add a carbon, change a double bond to a single bond, add a nitrogen. By doing that over and over again, you can get to any molecule you can think of," Virshup said.
He programmed the new algorithm to make small, random chemical changes to the structure of benzene and then to catalogue the new molecules it created based on where they fit into the map of the small-molecule universe. The challenge, Virshup said, came in identifying which new chemical compounds chemists could actually create in a lab.
Virshup sent his early drafts of the algorithm's newly constructed molecules to synthetic chemists who scribbled on them in red ink to show whether they were synthetically unstable or unrealistic. He then turned the criticisms into rules the algorithm had to follow so it would not make those types of compounds again.
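The mutate-then-filter loop described above can be illustrated with a toy sketch. This is not the team's published code: the molecule representation, the three moves, the valence-only rule and every function name here are assumptions made for illustration, and the real rules came from chemists' feedback rather than a simple valence table.

```python
import random

# Hypothetical stand-in for the chemists' rules: a simple valence cap per element.
MAX_VALENCE = {"C": 4, "N": 3}

def benzene():
    """Benzene's heavy-atom skeleton as (atoms, bonds), with alternating bond orders."""
    atoms = ["C"] * 6
    bonds = {}
    for i in range(6):
        j = (i + 1) % 6
        bonds[(min(i, j), max(i, j))] = 2 if i % 2 == 0 else 1
    return atoms, bonds

def used_valence(bonds, atom_index):
    """Sum of bond orders on one atom (hydrogens are left implicit)."""
    return sum(order for (a, b), order in bonds.items() if atom_index in (a, b))

def passes_rules(atoms, bonds):
    """Reject any structure in which an atom exceeds its valence cap."""
    return all(used_valence(bonds, i) <= MAX_VALENCE[el] for i, el in enumerate(atoms))

def mutate(atoms, bonds, rng):
    """Apply one small random change: add a carbon, add a nitrogen, or flip a bond order."""
    atoms, bonds = list(atoms), dict(bonds)
    move = rng.choice(["add_C", "add_N", "change_bond"])
    if move == "change_bond":
        key = rng.choice(sorted(bonds))
        bonds[key] = 1 if bonds[key] == 2 else 2
    else:
        anchor = rng.randrange(len(atoms))
        atoms.append(move[-1])                  # "C" or "N"
        bonds[(anchor, len(atoms) - 1)] = 1     # attach by a single bond
    return atoms, bonds

def explore(steps, seed=0):
    """Random walk from benzene, keeping only candidates that pass the rules."""
    rng = random.Random(seed)
    current = benzene()
    accepted = [current]
    for _ in range(steps):
        candidate = mutate(*current, rng)
        if passes_rules(*candidate):
            current = candidate
            accepted.append(candidate)
    return accepted

library = explore(200, seed=1)
```

Each rejected candidate is simply discarded and the walk continues from the last accepted structure, which is one plausible reading of how rules keep the search from drifting into unrealistic compounds.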
"The rules kept us from getting lost in the chemical space," he said.
After ten iterations, the algorithm finally produced 9 million synthesizable molecules representing every region of the small-molecule universe, and it produced a map showing the regions of the chemical space where scientists have not yet synthesized any compounds.
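One way to picture such a map, purely as an illustration, is to bin molecules into a coarse grid by a few numeric descriptors and list the cells that generated molecules reach but known compounds do not. The article does not say which coordinates the researchers used, so the descriptor tuples, the bin width, the function names and the sample values below are all made-up assumptions.

```python
from collections import defaultdict

def cell(descriptors, bin_width=1.0):
    """Map a tuple of numeric descriptors to a grid cell (a tuple of ints)."""
    return tuple(int(value // bin_width) for value in descriptors)

def blank_cells(generated, known, bin_width=1.0):
    """Return cells reached by generated molecules that hold no known compound."""
    known_cells = {cell(d, bin_width) for d in known}
    blanks = defaultdict(list)
    for name, descriptors in generated.items():
        c = cell(descriptors, bin_width)
        if c not in known_cells:
            blanks[c].append(name)
    return dict(blanks)

# Made-up descriptor values, for illustration only.
known = [(0.2, 0.7), (1.5, 0.1)]
generated = {"mol_a": (0.4, 0.9), "mol_b": (3.2, 2.8), "mol_c": (3.9, 2.1)}

unexplored = blank_cells(generated, known)
```

In this toy data, mol_a lands in a cell already occupied by a known compound, while mol_b and mol_c share a cell with no known compounds, which is the kind of blank region the map is meant to flag.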
"With the map, we can tell chemists, if you can synthesize a new molecule in this region of space, you have made a new type of compound," Virshup said. "It's an intellectual property issue. If you're in the blank spaces on our small molecule map, you're guaranteed to make something that isn't patented yet," he said.
The team has made the source code for the algorithm available online. The researchers said they hope scientists will use it to immediately start mining the unexplored regions of the small-molecule universe for new chemical compounds.
The research was supported by a grant from the National Institutes of Health (P50-GM067082).