To uncover cancer-causing mutations in the genome, a team has built a deep neural network that can rapidly scan the entire genome of cancer cells and identify mutations that occur more frequently than expected, suggesting that they are driving tumor growth. This type of prediction has been challenging in the past because some genomic regions have an extremely high frequency of passenger mutations, drowning out the signal of actual cancer drivers.

This work is published in Nature Biotechnology in the paper titled “Genome-wide mapping of somatic mutation rates uncovers drivers of cancer.”

“We created a probabilistic, deep-learning method that allowed us to get a really accurate model of the number of passenger mutations that should exist anywhere in the genome,” said Maxwell Sherman, a graduate student at MIT. “Then we can look all across the genome for regions where you have an unexpected accumulation of mutations, which suggests that those are driver mutations.”

Researchers found mutations across the genome that appear to contribute to tumor growth in 5–10% of cancer patients. The findings could help doctors identify drugs that would have a greater chance of successfully treating those patients, the researchers say. Currently, at least 30% of cancer patients have no detectable driver mutation that can be used to guide treatment.

Searching the genome for cancer-driving mutations is not new. The practice has successfully yielded targets such as epidermal growth factor receptor (EGFR), which is commonly mutated in lung tumors, and BRAF, a common driver of melanoma. Both of these mutations can now be targeted by specific drugs. But it has been difficult to figure out whether mutations in non-protein-coding regions contribute to cancer development.

“There has really been a lack of computational tools that allow us to search for these driver mutations outside of protein-coding regions,” said Bonnie Berger, PhD, professor of mathematics at MIT and head of the computation and biology group at the Computer Science and Artificial Intelligence Laboratory (CSAIL). “That’s what we were trying to do here: design a computational method to let us look at not only the 2% of the genome that codes for proteins, but 100% of it.”

To do that, the researchers trained deep neural networks to search cancer genomes for mutations that occur more frequently than expected. They trained the model on genomic data from 37 different types of cancer, which allowed the model to determine the background mutation rates for each of those types.

“The really nice thing about our model is that you train it once for a given cancer type, and it learns the mutation rate everywhere across the genome simultaneously for that particular type of cancer,” Sherman said. “Then you can query the mutations that you see in a patient cohort against the number of mutations you should expect to see.”
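The comparison Sherman describes, observed mutations against the number expected from the background rate, can be illustrated with a small sketch. This is not the Dig implementation: it assumes a plain Poisson model for passenger mutations, and the region names, counts, and the `alpha` threshold are hypothetical values chosen only for illustration.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Assumption: passenger mutations in a region follow a simple Poisson
    distribution. The published method uses its own probabilistic model.
    """
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    # Accumulate P(X = k) for k = 1 .. observed - 1.
    for k in range(1, observed):
        term *= expected / k
        cdf += term
    return max(0.0, 1.0 - cdf)


def flag_candidate_regions(regions, alpha=1e-3):
    """Return (name, p_value) pairs whose excess of mutations is unlikely
    under the background model, sorted by p-value."""
    flagged = []
    for name, (observed, expected) in regions.items():
        p_value = poisson_upper_tail(observed, expected)
        if p_value < alpha:
            flagged.append((name, p_value))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical data: region -> (observed mutations in the cohort,
# expected passenger mutations from a background model).
regions = {
    "region_A": (3, 2.5),
    "region_B": (12, 2.0),
    "region_C": (40, 38.0),
}

for name, p_value in flag_candidate_regions(regions):
    print(f"{name}: p = {p_value:.2e}")
```

In this toy example only the region with far more mutations than its background rate predicts is flagged. A region with many mutations but an equally high expected count is not, which is the situation that makes highly mutable regions hard to interpret without an accurate background model.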

Using this model, named Dig, the team was able to add to the known landscape of mutations that can drive cancer. Currently, when cancer patients’ tumors are screened for cancer-causing mutations, a known driver will turn up about two-thirds of the time. The new results of the MIT study offer possible driver mutations for an additional 5–10% of the pool of patients.

One type of noncoding mutation the researchers focused on is the “cryptic splice mutation.” Cryptic splice mutations are found in introns, where they can confuse the splicing machinery, resulting in introns being included when they shouldn’t be. Using their model, the researchers found that many cryptic splice mutations appear to disrupt tumor suppressor genes. The cryptic splice mutations that the researchers found in this study account for about 5% of the driver mutations found in tumor suppressor genes.

Targeting these mutations could offer a new way to potentially treat those patients. One possible approach uses antisense oligonucleotides (ASOs).

Another region where the researchers found a high concentration of noncoding driver mutations is in the untranslated regions of some tumor suppressor genes. The tumor suppressor gene TP53, which is defective in many types of cancer, was already known to accumulate many deletions in these sequences, known as 5’ untranslated regions. The MIT team found the same pattern in a tumor suppressor called ELF3.

The researchers also used their model to investigate whether common mutations that were already known might also be driving different types of cancers. As one example, the researchers found that BRAF, previously linked to melanoma, also contributes to cancer progression in smaller percentages of other types of cancers, including pancreatic, liver, and gastroesophageal.

“That says that there’s actually a lot of overlap between the landscape of common drivers and the landscape of rare drivers. That provides opportunity for therapeutic repurposing,” Sherman said. “These results could help guide the clinical trials that we should be setting up to expand these drugs from just being approved in one cancer, to being approved in many cancers and being able to help more patients.”