Scientists from Carnegie Mellon University and Stanford say that a group of nonexperts, who worked via an online interface and received feedback from lab experiments, has produced designs for RNA molecules that are consistently more successful than those generated by the best computerized design algorithms.
The researchers also note that they gathered some of the best design rules and practices generated by players of the online EteRNA design challenge and, using machine learning principles, generated their own automated design algorithm, EteRNABot, which also performed better than prior design algorithms. Though this improved computer design tool is faster than humans, the designs it generates still do not match the quality of those of the online community, which now has more than 130,000 members, according to the university team.
The research report (“RNA design rules from a massive open laboratory”) will be published this week in the Proceedings of the National Academy of Sciences Online Early Edition.
“The quality of the designs produced by the online EteRNA community is just amazing and far beyond what any of us anticipated when we began this project three years ago,” said Adrien Treuille, Ph.D., an assistant professor of computer science and robotics at Carnegie Mellon, who leads the project with Rhiju Das, Ph.D., an assistant professor of biochemistry at Stanford, and Jeehyung Lee, a Ph.D. student in computer science at Carnegie Mellon.
“This wouldn’t be possible if EteRNA members were just spitting out designs using online simulation tools,” Dr. Treuille continued. “By actually synthesizing the most promising designs in Das’ lab at Stanford, we’re giving our community feedback about what works and doesn’t work in the physical world. And, as a result, these nonexperts are providing us insight into RNA design that is significantly advancing the science.”
In the research being reported this week, the researchers tested the performance of the EteRNA community, EteRNABot, and two RNA design algorithms in generating designs that would cause RNA strands to fold themselves into certain shapes. The computers could generate designs in less than a minute, while most people would take one or two days; synthesizing the molecules to determine the success and quality took a month for each design, so the entire experiment lasted about a year.
“The EteRNA project connects 37,000 enthusiasts to RNA design puzzles through an online interface,” writes the Carnegie Mellon-Stanford team in PNAS. “Uniquely, EteRNA participants not only manipulate simulated molecules but also control a remote experimental pipeline for high-throughput RNA synthesis and structure mapping. We show herein that the EteRNA community leveraged dozens of cycles of continuous wet laboratory feedback to learn strategies for solving in vitro RNA design problems on which automated methods fail.”
In the end, Lee said, the designs produced by humans had a 99% likelihood of being superior to those of the prior computer algorithms, while EteRNABot produced designs with a 95% likelihood of besting the prior algorithms.
“The quality of the community’s designs is so good that even if you generated thousands of designs with computer algorithms, you’d never find one as good as the community’s,” noted Lee.
Though EteRNA players may not be scientifically trained, they nevertheless have instincts that, when bolstered by the lab experiments, can lead to new insights. “Most players didn’t have tactical insights on RNA designs,” continued Lee. “They would just recognize patterns—visual patterns. Scientifically, not all of these rules initially seemed to make sense, but people who were following them did better.”
The project is now looking at expanding its design regimen to include three-dimensional designs. The team is also developing a template that researchers in other fields can use to turn scientific projects into online challenges.