Across scientific disciplines, large language models (LLMs) are being used to solve various research problems. At Carnegie Mellon University, scientists have used LLMs to create an intelligent lab partner that is capable of designing and remotely running chemistry research experiments. Details of the proof-of-concept system and its architecture are provided in a Nature paper published this week titled “Autonomous chemical research with large language models.”  

Gabriel Gomes, PhD, an assistant professor of chemistry and chemical engineering at Carnegie Mellon, and two chemical engineering doctoral students, Daniil Boiko and Robert MacKnight, designed the system, which is called Coscientist. It uses OpenAI’s GPT-4, Anthropic’s Claude, and other LLMs to execute experimental processes. A scientist working with the system can design and run an experiment much faster, more accurately, and more efficiently than a human working alone.

“Our system demonstrates advanced reasoning and experimental design capabilities, addressing complex scientific problems and generating high-quality code,” the researchers wrote. “These capabilities emerge when LLMs gain access to relevant research tools such as internet and documentation search, coding environments, and robotic experimentation platforms.”

The paper describes Coscientist’s capabilities, including its ability to plan the chemical synthesis of known drugs and other compounds; search and navigate hardware documentation; execute high-level commands in an automated facility known as a cloud lab; control liquid handling instruments; complete tasks that require the use of multiple hardware modules and data sources; and solve optimization problems by analyzing existing data.

For example, a scientist could ask Coscientist to find a compound with specific properties. The system would then search the internet, any available technical documentation, and other sources, synthesize the information it finds, and then choose an appropriate experimental plan. That plan is then sent to the lab and completed by automated instruments. 
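
The workflow described above can be illustrated with a minimal Python sketch. It is not Coscientist's actual code or architecture: every name in it (`Plan`, `run_request`, the stub search, planning, and submission functions, and the numeric score) is a hypothetical stand-in, and the LLM and the lab are replaced by simple stubs so the example runs on its own. It only shows the order of the steps: gather information from several sources, pool it, choose one plan, and hand that plan to an automated lab.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Plan:
    """A candidate experimental plan (hypothetical structure)."""
    description: str
    score: float  # assumed ranking value; higher means a better fit


def run_request(
    request: str,
    sources: List[Callable[[str], List[str]]],
    propose_plans: Callable[[str, List[str]], List[Plan]],
    submit_to_lab: Callable[[Plan], str],
) -> str:
    """Search every source, pool the findings, pick the best plan, submit it."""
    findings: List[str] = []
    for search in sources:
        findings.extend(search(request))
    plans = propose_plans(request, findings)
    if not plans:
        raise ValueError("no experimental plan could be proposed")
    best = max(plans, key=lambda p: p.score)
    return submit_to_lab(best)


# Stubs standing in for the real search tools, the LLM, and the lab.
def web_search(query: str) -> List[str]:
    return [f"web result for: {query}"]


def docs_search(query: str) -> List[str]:
    return [f"instrument documentation for: {query}"]


def propose(request: str, findings: List[str]) -> List[Plan]:
    return [
        Plan(f"plan A using {len(findings)} findings", 0.4),
        Plan(f"plan B using {len(findings)} findings", 0.9),
    ]


def submit(plan: Plan) -> str:
    return f"submitted: {plan.description}"


result = run_request(
    "compound with specific properties",
    [web_search, docs_search],
    propose,
    submit,
)
print(result)
```

Passing the search, planning, and submission steps in as functions keeps the sketch independent of any particular LLM provider or lab interface, which is a design choice of this example rather than something the paper is reported to do.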

“We anticipate that intelligent agent systems for autonomous scientific exploration will bring tremendous discoveries, unforeseen therapies, and new materials,” the team wrote in the paper. “While we cannot predict what those discoveries will be, we hope to see a new way of conducting research given by the synergetic partnership between humans and machines.”

To demonstrate that Coscientist could be used in an automated lab environment, the Carnegie Mellon team partnered with the Emerald Cloud Lab (ECL), a remotely operated research facility founded by Carnegie Mellon alumni. Carnegie Mellon is working with ECL to establish a cloud lab at the university that will offer access to more than 200 pieces of equipment, including instruments for cell culture, sample preparation, mass spectrometry, and bioassays, all remotely controlled from a single, unified software interface.

The Carnegie Mellon cloud lab will be set up to handle experiments in a wide range of disciplines, including genetic engineering and synthetic biology as well as structural biology, biochemistry, and physical chemistry. There are plans to add support for new disciplines such as cell biology and medicinal chemistry. Gomes intends to continue developing the technologies described in the Nature paper so that they can be used in the cloud lab at Carnegie Mellon and similar autonomous labs in the future.

“Beyond the chemical synthesis tasks demonstrated by their system, Gomes and his team have successfully synthesized a sort of hyper-efficient lab partner,” according to David Berkowitz, the chemistry division director for the National Science Foundation. “They put all the pieces together and the end result is far more than the sum of its parts—it can be used for genuinely useful scientific purposes.” 

The paper also addresses potential safety concerns surrounding the use of LLMs. As part of their efforts, Gomes’ team investigated the possibility of instructing Coscientist to design and plan experiments aimed at synthesizing hazardous chemicals or controlled drugs. Details are provided in the paper’s supporting information.

“I believe the positive things that AI-enabled science can do far outweigh the negatives,” Gomes noted. “But we have a responsibility to acknowledge what could go wrong and provide solutions and failsafes.”
