As our ability to sequence genomes has skyrocketed, allowing us to churn out A’s, C’s, G’s, and T’s at breakneck speed, our capacity to decipher the sequences has not kept up. This issue was discussed at a recent meeting organized by Advances in Genome Biology and Technology. “We have well exceeded our ability as humans to deal with data,” said Eric Topol, MD, founder and director of the Scripps Research Translational Institute, professor of molecular medicine, and executive vice president of Scripps Research. “We need help from machines.” To have high-performance medicine, he asserted, “we need high-performance computing.”
It’s not exactly a new message—two decades ago, J. Craig Venter famously harnessed a commercial supercomputer to help Celera Genomics produce its first human genome assembly. But more and more companies are heeding Topol’s call, using artificial intelligence (AI) to help turn genomic data into knowledge, with the ultimate goal of using it to inform therapeutic interventions and improve patient outcomes. Here, we examine several companies with ambitious plans to bring AI to precision medicine.
Bringing “wisdom” to genomics
Sophia Genetics started in Switzerland in 2011. The company’s original location, perhaps, played an important role in its global footprint of today. “We had to operate internationally,” notes Kevin Puylaert, the company’s general manager of North America. “If we had started in the United States, we may not have felt the need to go global.”
The goal at Sophia (whose moniker means “wisdom” in Greek) is to “democratize data-driven science,” according to Puylaert. Sophia does not collect samples or perform sequencing. Rather, hospitals send their data to Sophia’s AI platform for genomic variant detection, annotation, and interpretation. The report is then returned to the hospital to inform the clinician’s diagnosis.
AI is at the heart of the platform, which uses statistical analysis, pattern recognition, and machine learning to analyze the data. Training its algorithm on the sample types and the sequencing and enrichment technologies, Sophia can analyze many more patients and collate the information in a manner that surpasses human ability.
Perhaps even more critical to the company’s success, however, is the number of patients in its platform. “Of course, you want to have the best possible algorithm,” explains Puylaert. “But you can’t improve [the algorithm] by an order of magnitude. If you want to have a smarter AI, you need more data.” Data are something that Sophia has a lot of—working with 1000 institutions in 81 countries and over 380,000 genomic profiles, analyzing more than 15,000 cases a month.
This wide footprint allows for diversity both in the genomes included in the database and in the knowledge and training of the participating clinicians. When many experts collaborate, declares Puylaert, “new ideas will come from unexpected places.”
What is sorely needed in this area, according to Heidi Rehm, PhD, chief genomics officer in the department of medicine at Massachusetts General Hospital (MGH), is access to “good, detailed, longitudinal, phenotypic data as well as other evidence used to interpret genetic variants.” She adds that “AI could be helpful” but admits that it relies on structured and computable data.
“There are many companies that are launching AI and algorithms, but they need standardized data to operate well,” agrees Puylaert. Although genomic data can be produced cheaply, they are also noisy. People have “tricks” to remove that noise, but everyone has their own approach, he notes. What one lab calls noise, another might call a pathogenic variant. Hence, patients can be diagnosed differently depending on where their samples are sent.
Standardizing data is crucial, Puylaert emphasizes, but is “very painful.” It’s a lengthy and resource-intensive process. Each new lab that works with Sophia goes through a program to make sure that the data are of sufficient quality to be shared within the platform. Sharing data levels the playing field and allows a hospital in Bogotá to diagnose its patients with the same information as a hospital in Boston. But as Puylaert points out, in a competitive environment like the United States, some institutions could feel pressure to keep competitive knowledge to themselves.
Regardless of the hurdles, Sophia’s goal is to generate reports that “democratize a more predictive, preventive, personalized, and participative medicine” according to chief executive officer Jurgi Camblong, PhD.
No sequencer? No problem
Eric Lefkofsky has spent his career “structuring unstructured, messy, data.” The Groupon founder realized, when someone close to him was diagnosed with cancer, the importance of bringing technology to the physicians that were seeing patients every day. This motivated Lefkofsky to launch Tempus, which uses advanced machine learning on genomics and AI-assisted image recognition to build a platform that is designed, like the Sophia platform, to help physicians make better decisions about diagnosis and treatment.
One big difference between the two companies is that, whereas Sophia receives information, Tempus welcomes samples. Introduced in October 2017, Tempus’ sequencing panel, Tempus xT, analyzes 595 genes related to “diagnosis, prognosis, and therapeutic targeting of cancer.” Since then, they have added whole exome sequencing and a liquid biopsy panel. Tempus sequences DNA and RNA and gathers clinical data—namely phenotypic, therapeutic, and outcome response data. In an interview at the 2017 Fortune Brainstorm Health Conference, Lefkofsky said that gathering these data in the same place can “begin to answer some very basic questions.” He added that these data should “flow freely,” not just to clinicians and researchers, but also to insurance companies and biotech companies. But, he noted, “the system is broken.”
Based in Chicago, Tempus has already raised $520 million. The company has also been amassing some impressive talent. A notable recent addition is Joel Dudley, PhD, the founding director of the Institute for Next Generation Healthcare at the Icahn School of Medicine at Mount Sinai. (He comes to Tempus after an illustrious career in precision health and genomics.) Another is Lauren Silvis, JD, who formerly served as the FDA’s chief of staff under Scott Gottlieb, MD. (Silvis will join Tempus as the senior vice president of external affairs.)
Talent they will need, as the company has stiff competition, largely from the decade-old Foundation Medicine. Foundation, which Roche acquired last year, offers genomic profiling assays designed to match patients with targeted therapies, immunotherapies, and clinical trials based on their genetic variants. Based in Cambridge (but planning to move to new space in the Boston Seaport district by 2023), Foundation, like Tempus, takes a centralized approach to applying AI to genomics by analyzing patient samples sent to the company. In late 2017, the FDA gave the company the green light to market FoundationOne CDx, which detects mutations in 324 genes, select gene rearrangements, and genomic signatures including microsatellite instability and tumor mutational burden, for use in all solid tumors.
“There are pluses and minuses to these types of companies,” says Elaine Mardis, PhD, co-executive director of the Institute for Genomic Medicine at Nationwide Children’s Hospital. She tells GEN that, because they have a lot of bandwidth and resources, and because of the motivations built into their business models, “they do a better job at accumulating, annotating, and curating genomic data when compared to similar academic efforts.” If you are worried about accessibility and democratization of these tests, “they may play an important role in ensuring equal patient access to testing.”
But, she says, clinicians who are trained to interpret genomic data are more likely to access resources like ClinVar—a free archive of relationships among medically important genetic variants and phenotypes—to aid their interpretation, diagnosis, and treatment decisions. “Both are important,” she concludes.
A deep dive
As genome sequencing gets cheaper, there is a trend, notes Brendan Frey, PhD, chief executive officer and co-founder of Deep Genomics, to “just sequence more genomes.” But patients receive only a small fraction of useful information.
Trained by AI pioneer Geoffrey Hinton, PhD, Frey has spent his career working in machine learning as a professor at the University of Toronto. In 2002, AI became personal for Frey when he and his wife learned that the baby she was carrying had some concerning genetic test results. He says that he learned the “value in reducing uncertainty” and, as a result, changed the trajectory of his group’s research to focus on using machine learning techniques to understand the genome.
The problem, Frey explained during a talk at an EmTech Digital event organized by MIT Technology Review, is our current inability to connect genotype to phenotype in an explanatory way that is reliable, scalable, and trustworthy.
He noted that current approaches—for example, the GWAS approach or the collection of lots of genomes—are not going to work. No matter how much data we collect, we cannot close the gap, he insisted.
What sets Deep’s approach apart from correlative approaches is its ability to provide explanatory information. If there is a pathogenic variant that leads to a phenotype, Deep can work backward through the architecture to understand why that variant is leading to the disease. This will provide different ways to interfere with the problem, whether it’s modulating a protein level or using CRISPR gene editing. It not only allows prediction, as in the example of hopeful soon-to-be parents, but also has the ability to explain a variant within a specific context.
How is Deep going to do it? Frey uses an analogy that is based on teaching a child how to read. You don’t just give your child Tolstoy to read. Rather, you present simple stories first. Because, he explains, no matter how many textbooks you put in front of the child, if you don’t take the time to teach them what each word means and how grammar works, they won’t learn to read.
Deep uses datasets being built around the world by academic researchers and private institutions that measure the changes in DNA, one picture book at a time. Deep is building AI systems that understand the relationship between the DNA variants and processes going on in the cell. Frey asserts that this is the way forward to understand how to consider types of therapies.
Unlike speech and vision, two areas where AI’s contributions seem to be extensions of our own abilities, the processing of genomic information seems alien. The genome, Frey points out, is “written in a language that is foreign to us and corresponds to processes that we don’t observe in our everyday life.” We need superhuman intelligence—or AI—to understand how genetic variants lead to disease and to close the “genotype–phenotype” gap. As Frey emphasizes, “humans are horrible at reading the genome.”
Drugging the Regulatory Genome—30 Years in the Making
Next year marks the 30th anniversary of the launch of the Human Genome Project, and yet a major impediment in drug development remains: a lack of understanding of disease biology. The map of the genome provided the code for producing proteins, leading to medicines targeting genetic mutations. But that was only 2% of the puzzle.
Many diseases are driven by abnormal expression of genes, but how genes are turned on, off, up, or down remained a mystery. Something goes on in the 98% of the genome that was once considered “junk” because it doesn’t code for proteins.
Rapidly growing science is now elucidating how that 98% determines the function of every cell by controlling which genes are expressed at what time in what amounts. Understanding this “regulatory genome” will help scientists create a new generation of medicines. These medicines will be used to control the expression of genes. Already, experimental regulatory medicines are starting to work their way through clinical trials.
For example, Syros Pharmaceuticals is investigating SY-1425 in a Phase II trial in acute myeloid leukemia patients with a highly active regulatory region that prevents blood cells from properly differentiating. The company’s second program, a CDK7 inhibitor in a Phase I trial in ovarian and breast cancer patients, targets a regulatory component to lower the expression of oncogenes and prevent cell proliferation.
“Drugging the regulatory genome represents the next frontier in medicine,” says Eric Olson, Syros’ chief scientific officer. “By controlling the expression of both normal and abnormal genes, we may find a way to treat diseases that have eluded other genomics-based approaches.”