“We are at the cusp of a new era in therapeutic research,” proclaimed Fabrice Chouraqui, PharmD, CEO of Cellarity. At the time, he was kicking off the second Single Cell and AI in Medicine Symposium (SCAIM23), which was held last May at Boynton Yards in Somerville, MA. But his remarks were clearly intended for a broader audience, one that includes GEN readers and anyone, really, who is interested in the way drug development is being revolutionized by new technology, specifically by innovations at the intersection of single-cell analysis, artificial intelligence (AI), and biology.
The themes outlined by Chouraqui were articulated throughout the symposium. For example, the event’s talks and panels emphasized the development of new genomics technologies, the accumulation of multiomics data, and the application of AI and deep learning models. All these activities are generating profound biological insights and speeding the introduction of novel drugs to developers’ pipelines. These drugs qualify as novel because they are currently undiscoverable with existing technologies, but they are becoming discoverable now that single-cell and AI technologies are being improved and, just as important, unified.
In a panel discussion, Laurens Kruidenier, PhD, chief scientific officer at Cellarity, noted that innovations across diverse disciplines are transforming medicine. He highlighted two main trends. The first is a shift from individual software solutions to well-connected, end-to-end pipelines. The second is a growing interest in building interdisciplinary teams.
Kruidenier said that the first trend—interconnectedness—will lead to better solutions for bigger problems. He suggested that the second trend—interdisciplinarity—will open the industry’s best minds and break down traditional barriers. One such barrier, he recalled, was much in force in his graduate school days. Kruidenier said that it prevented him from even thinking of leaving the biology building to visit anyone working in the computer science building.
Kruidenier observed that there is a “whole new generation of scientists” and that it includes scientists who are able and willing to learn the different languages used in drug discovery—languages that range across biology, chemistry, and computational science. He added that communicating across disciplines helps scientists develop better tools and use them to their full potential. He also highlighted a more general benefit of interdisciplinarity: it helps scientists cultivate the brainpower they need to ask the right questions at the right time.
Using large language models and genomics for cell and gene therapy was on the mind of Tommaso Biancalani, PhD, a distinguished scientist and the director of AI/ML at Genentech. He pointed out that these technologies could be used to advance cell engineering. For example, cells could be equipped with regulatory elements to enhance the expression of proteins of interest. According to Biancalani, the design of such elements is a machine learning question.
Enoch Huang, PhD, vice president, machine learning and computational sciences at Pfizer, noted that the advances of the last decade have yielded insights that allow biology to be probed in a more disease-relevant way than ever before.
Tracking cells, exploring pathogenesis
How are phenotypes in cells encoded, and how can these phenotypes be mapped? These are questions that Sydney Shaffer, MD, PhD, is working to answer in her laboratory at the University of Pennsylvania.
At the symposium, Shaffer said that omics data can offer rich snapshots, and that different snapshots can be pieced together to reveal connections between gene expression states and cell phenotypes. Although these connections can be uncovered computationally, Shaffer chooses to take a more direct approach. She uses single-cell data to predict how cells may change in response to treatment or through the course of disease. For example, she is interested in studying transcriptional shifts and their role in phenomena such as drug resistance and metastasis.
At the center of her research is cellular barcoding. In this technique, which predates single-cell genomics, mitochondrial DNA mutations serve as barcode information, enabling single-cell lineage tracing and the profiling of cells isolated from different tissues. Because the mutations are integrated into mitochondrial genomes, barcodes are passed on when cells divide.
To collect single-cell lineage information in the form of mitochondrial DNA mutations, Shaffer and colleagues employ high-throughput single-cell RNA sequencing. Then the scientists analyze this information to understand both transcriptional states and progenitor relationships. After determining which cells are sharing common progenitors, the scientists can build lineages and uncover the evolution of disease.
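The grouping step described above, in which cells that share common progenitors are assigned to the same lineage, can be illustrated with a small Python sketch. This is not the Shaffer laboratory's pipeline. It assumes, purely for illustration, that each cell has already been reduced to a set of detected mitochondrial variants, and that any two cells sharing at least one variant belong to the same putative clone. The function name, cell names, and variant labels are all hypothetical.

```python
from collections import defaultdict


def group_cells_by_shared_variants(cell_variants):
    """Group cells into putative clones.

    Assumption (illustrative only): two cells belong to the same clone
    if they share at least one mitochondrial variant, directly or
    through a chain of other cells.
    """
    # Union-find structure: every cell starts as its own clone.
    parent = {cell: cell for cell in cell_variants}

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every cell to the first cell seen carrying the same variant.
    first_carrier = {}
    for cell, variants in cell_variants.items():
        for variant in variants:
            if variant in first_carrier:
                union(first_carrier[variant], cell)
            else:
                first_carrier[variant] = cell

    clones = defaultdict(set)
    for cell in cell_variants:
        clones[find(cell)].add(cell)
    return sorted(sorted(members) for members in clones.values())


# Hypothetical toy input: variant labels are placeholders, not real calls.
cells = {
    "cell_1": {"variant_A"},
    "cell_2": {"variant_A", "variant_B"},
    "cell_3": {"variant_B"},
    "cell_4": {"variant_C"},
    "cell_5": set(),
}
clones = group_cells_by_shared_variants(cells)
```

In this toy input, the first three cells are linked through their shared variants, while the remaining two cells each form their own group. Real data would call for handling sequencing noise and variant frequency thresholds, which this sketch deliberately ignores.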
Shaffer recently led a study that used cellular barcoding in an analysis of cell dynamics in esophageal cancer. Lineage tracing and transcriptional profiling were used to create a clearer picture of the evolution of Barrett’s metaplasia, a condition that increases the risk of esophageal cancer. Shaffer and colleagues found that the condition is polyclonal, with lineages that contain all progenitor and differentiated cell types.
From single cells to ecosystems
One of the earliest spatial transcriptomics technologies, Slide-seq, was developed just over four years ago by a team that included Fei Chen, PhD, of the Broad Institute. Since then, he has continued to innovate in the spatial space.
At the symposium, Chen said that he is “fully on board with” a common goal in single-cell genomics. The goal is to manipulate cells along their developmental trajectories to alter the course of disease. However, that goal may be hard to reach if single-cell measurements are taken only in the context of dissociated cells.
Chen remarked that he wants to “study cells in their ecosystem.” Moreover, he described how his laboratory does so using a spatial sequencing method it developed. The method, called Slide-tags, combines high-throughput single-cell genomics and single-nucleus barcoding.
“[Cellular] nuclei from an intact fresh frozen tissue section are ‘tagged’ with spatial barcode oligonucleotides derived from DNA-barcoded beads with known positions,” Chen and colleagues wrote in a recent bioRxiv preprint (April 23, 2023). “Isolated nuclei are then profiled with existing single-cell methods with the addition of spatial positions.”
In an earlier bioRxiv preprint (March 8, 2023), Chen and colleagues demonstrated the power of a spatial technique that paired high-throughput single-nucleus RNA sequencing with Slide-seq. They built a complete cell atlas of the mouse brain from single-cell profiles of six million cells. In parallel, they reconstructed a three-dimensional mouse brain from 101 sections (1 section every 100 microns) to perform spatial transcriptomics. Putting those two data sets together allowed the researchers to see where cells are in the brain. Moreover, they uncovered new biology in the brain, including cell types that are not located in the three most studied regions of the brain—the cortex, cerebellum, or hippocampus. The atlas provides an opportunity to elucidate what those cells are doing and how they could be drugged.
The brain atlas, however, required the integration of two separate data sets: single-cell profiles and spatial transcriptomics data. This approach, Chen maintained, is less than ideal. He called it “a single-cell house divided.”
Ideally, one technology would allow for the collection of single cells while including their location information. The Slide-tags method solves this problem by taking single cells, localizing them, and then running them through the single-cell sequencing pipeline.
The Slide-tags method also allows spatially “tagged” nuclei to be channeled into standard workflows for single-nucleus RNA sequencing, single-nucleus ATAC sequencing, and T-cell receptor sequencing. (When Chen and colleagues performed T-cell receptor sequencing from single-cell libraries, they were able to recover and map both alpha and beta chains.) According to Chen, the Slide-tags method is a way of combining single-cell and spatial data that lets researchers “have their cake and eat it too.”
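The idea of tagging nuclei with spatial barcodes from beads with known positions can be sketched in a few lines of Python. This is a simplified illustration, not the method from the preprint. It assumes that each nucleus comes with counts of the spatial barcodes it captured, and it estimates the nucleus position as the count-weighted average of the matching bead positions. The function name, barcode strings, and coordinates are hypothetical.

```python
def locate_nucleus(barcode_counts, bead_positions):
    """Estimate a nucleus position from the spatial barcodes it carries.

    Assumption (illustrative only): the position is the count-weighted
    centroid of the beads whose barcodes were detected in the nucleus.
    Returns None if no detected barcode matches a known bead.
    """
    total = 0
    x_sum = 0.0
    y_sum = 0.0
    for barcode, count in barcode_counts.items():
        if barcode not in bead_positions:
            continue  # barcode not on the bead map; skip it
        x, y = bead_positions[barcode]
        x_sum += count * x
        y_sum += count * y
        total += count
    if total == 0:
        return None
    return (x_sum / total, y_sum / total)


# Hypothetical bead map: barcode -> (x, y) position in microns.
bead_positions = {
    "AAAC": (0.0, 0.0),
    "CCGT": (10.0, 0.0),
    "GGTA": (0.0, 10.0),
}

# Hypothetical nucleus that captured two barcodes equally often.
position = locate_nucleus({"AAAC": 2, "CCGT": 2}, bead_positions)
```

Once each nucleus has a position attached, its expression profile can flow through an ordinary single-nucleus workflow with the coordinates carried along as extra metadata, which is the point of the “cake and eat it too” remark.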
Predicting cell fates
Samantha Morris, PhD, associate professor of genetics and developmental biology at Washington University in St. Louis, started her talk with her “hot take” that cellular reprogramming has failed to deliver on its promises. She heads a laboratory that works to navigate the landscape of cell identity to generate clinically valuable cell types. For example, in 2014, the laboratory developed CellNet—a network biology–based platform to measure the identity of engineered cells against their in vivo correlates.
Morris and colleagues kept thinking about ways to deconstruct cell identity to increase reprogramming efficiency and fidelity. Eventually, they moved beyond CellNet, which uses bulk cells. They developed Capybara, a platform that measures cell identity at single-cell resolution.
According to Morris, current reprogramming cocktails are highly ineffective, necessitating a new way of predicting transcription factor biology. To dissect how transcription factors are controlling identity, she and her team built CellOracle, an approach to infer gene regulatory networks from single-cell ATAC sequencing and single-cell RNA sequencing. CellOracle, she said, could result in better cocktails.
Gene regulatory networks (GRNs) are the master regulators of cell identity. Morris reported that she and colleagues can use the GRN model to simulate shifts in cell identity following transcription factor perturbation to predict the identities cells will assume. They currently have base GRNs for 12 different species. Morris started a new company in 2022 called Capybio to improve upon the process.
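Simulating a shift in cell identity after a transcription factor perturbation can be illustrated with a toy model. The Python sketch below is not CellOracle and does not use its API. It assumes a small linear network in which each target gene's change in expression is a weighted sum of the changes in its regulators, and it propagates a perturbation through the network for a fixed number of steps. The function name, gene names, weights, and expression values are all hypothetical.

```python
def simulate_perturbation(expression, grn, perturbed_gene, new_value, n_steps=3):
    """Propagate a transcription factor perturbation through a toy GRN.

    Assumptions (illustrative only):
    - grn maps each target gene to {regulator: weight}.
    - A target's expression change is the weighted sum of its
      regulators' changes from the previous step.
    - The perturbed gene is held at new_value throughout.
    """
    genes = list(expression)
    forced_delta = new_value - expression[perturbed_gene]
    delta = {gene: 0.0 for gene in genes}
    delta[perturbed_gene] = forced_delta

    for _ in range(n_steps):
        next_delta = {}
        for target in genes:
            regulators = grn.get(target, {})
            next_delta[target] = sum(
                weight * delta[regulator]
                for regulator, weight in regulators.items()
            )
        next_delta[perturbed_gene] = forced_delta  # keep the perturbation clamped
        delta = next_delta

    # Expression cannot drop below zero.
    return {gene: max(expression[gene] + delta[gene], 0.0) for gene in genes}


# Hypothetical three-gene chain: TF_A activates gene_B, gene_B represses gene_C.
expression = {"TF_A": 2.0, "gene_B": 1.0, "gene_C": 1.0}
grn = {
    "gene_B": {"TF_A": 0.5},
    "gene_C": {"gene_B": -1.0},
}

# Simulated knockout of TF_A.
after_knockout = simulate_perturbation(expression, grn, "TF_A", 0.0)
```

In this toy chain, knocking out TF_A lowers gene_B, which in turn relieves the repression of gene_C. Comparing the simulated profile with reference profiles of known cell types would be one way to guess which identity a cell is moving toward.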
Diverse approaches, common goals
In a panel on industry trends, Jonah Cool, PhD, science program officer at the Chan Zuckerberg Initiative, suggested that extraordinary progress in AI-powered single-cell technology calls for extraordinary collaboration. Bridges across academic, biotechnology, and pharmaceutical sectors must be built, he continued. But he admitted that coordinating efforts across all these sectors will be challenging.
Kruidenier agreed, noting that different perspectives will need to be accommodated as biological discovery comes to be driven less exclusively by biologists. He said that computationalists and biologists gain different insights from the same data set, and that computationalists can play an important role by reducing bias.
Mention of bias, or the “B” word, prompted Huang to share thoughts he had heard from an advocate of the unbiased approach. Huang said that despite the expertise in disease pathways that biologists bring to the table, unbiased (or orthogonal) learning from the data, in an untargeted, systematic way, is critical. He added that revealing new targets in this way requires computationalists and biostatisticians.
One challenge is trying to get an organization to adopt that mindset. Culture is established by those who lead, noted Biancalani. And if the leader is good, and can inspire trust, people will follow. Biancalani said that at Genentech, such a leader is Aviv Regev, PhD, executive vice president and head of Genentech Research and Early Development.
Another challenge is getting everyone to speak the same language. It is not obvious how best to help leaders in AI, molecular biology, and computational biology become familiar enough with each other’s disciplines to talk with each other and work together. It takes time and a shared language.
It is important, noted Cool, for collaborators to share a clear goal. It is also important for collaborators to pursue the goal flexibly. People need to understand that progress may not be linear. But a clear, common goal, in addition to flexibility, will be the key to building the bridges that are necessary to make advances.