A study by researchers at Brigham and Women’s Hospital has demonstrated a proof-of-concept model that uses artificial intelligence (AI) to combine multiple types of data from different sources, and predict patient outcomes for 14 different types of cancer. “This work sets the stage for larger health care AI studies that combine data from multiple sources,” said research lead Faisal Mahmood, PhD, an assistant professor in the division of computational pathology at Brigham and associate member of the cancer program at the Broad Institute of Harvard and MIT. “In a broader sense, our findings emphasize a need for building computational pathology prognostic models with much larger datasets and downstream clinical trials to establish utility.”
Mahmood and colleagues reported on their work in Cancer Cell, in a paper titled, “Pan-cancer integrative histology-genomic analysis via multimodal deep learning.”
It’s long been understood that predicting outcomes in patients with cancer involves considering many different factors, such as patient history, genes, and disease pathology, but clinicians struggle to integrate this information when making decisions about patient care. Experts depend on several sources of data, like genomic sequencing, pathology, and patient history, to diagnose and prognosticate different types of cancer. And while existing technology enables them to use this information to predict outcomes, manually integrating data from different sources is challenging and experts often find themselves making subjective assessments. Moreover, the authors wrote, “ … the subjective interpretation of histopathologic features has been demonstrated to suffer from large inter- and intraobserver variability, and patients in the same grade or stage still have significant variability in outcomes.”
Mahmood further noted, “Experts analyze many pieces of evidence to predict how well a patient may do. These early examinations become the basis of making decisions about enrolling in a clinical trial or specific treatment regimens. But that means that this multimodal prediction happens at the level of the expert. We’re trying to address the problem computationally.”
Mahmood and colleagues devised a means to integrate several forms of diagnostic information computationally to yield more accurate outcome predictions. “In order to address the challenges in developing joint imageomic biomarkers that can be used for cancer prognosis, we propose a deep-learning-based multimodal fusion (MMF) algorithm that uses both H&E WSIs [whole slide images] and molecular profile features (mutation status, copy-number variation, RNA sequencing [RNAseq] expression) to measure and explain relative risk of cancer death,” they wrote.
Their resulting AI models demonstrated the ability to make prognostic determinations while also revealing which features drive the predicted patient risk, a property that could be used to uncover new biomarkers. The researchers built the models using The Cancer Genome Atlas (TCGA), a publicly available resource containing data on many different types of cancer.
They developed a multimodal deep learning-based algorithm that is capable of learning prognostic information from multiple data sources. By first creating separate models for histology and genomic data, they could fuse the two into a single integrated model that provides key prognostic information. Finally, they evaluated the model’s efficacy on paired patient histology and genomic data spanning 14 cancer types. The results demonstrated that the models yielded more accurate patient outcome predictions than those incorporating only single sources of information.
“In this study, we present a method for interpretable, weakly supervised, multimodal deep learning that integrates WSIs and molecular profile data for cancer prognosis, which we trained and validated on 6,592 WSIs from 5,720 patients with paired molecular profile data across 14 cancer types …” they continued.
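The general idea of fusing a slide-level representation with a molecular representation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ implementation: the attention pooling over image patches, the single-layer molecular encoder, the outer-product fusion, the random weights, and every function and parameter name below are hypothetical choices made for the example.

```python
import numpy as np


def init_params(rng, d_patch, n_mol, d_mol):
    """Random, untrained weights (illustrative only; a real model would learn these)."""
    return {
        "w_att": rng.normal(size=d_patch),                      # scores each patch
        "w_mol": rng.normal(size=(n_mol, d_mol)),               # encodes molecular features
        "w_out": rng.normal(size=(d_patch + 1) * (d_mol + 1)),  # maps fused vector to risk
    }


def attention_pool(patches, w_att):
    """Collapse a bag of patch embeddings (n_patches, d_patch) into one slide vector.

    Only a slide-level label is needed, which is what makes the setup
    weakly supervised: no patch is individually annotated.
    """
    scores = patches @ w_att
    scores = scores - scores.max()            # subtract max for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ patches, weights


def encode_molecular(features, w_mol):
    """Map a molecular profile vector (n_mol,) to a compact embedding (d_mol,)."""
    return np.tanh(features @ w_mol)


def fuse(h_img, h_mol):
    """Outer-product fusion; the appended 1s keep the single-modality terms."""
    a = np.append(h_img, 1.0)
    b = np.append(h_mol, 1.0)
    return np.outer(a, b).ravel()


def risk_score(patches, molecular, params):
    """Return a scalar relative-risk score plus the per-patch attention weights."""
    h_img, weights = attention_pool(patches, params["w_att"])
    h_mol = encode_molecular(molecular, params["w_mol"])
    fused = fuse(h_img, h_mol)
    return float(fused @ params["w_out"]), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, d_patch=8, n_mol=5, d_mol=4)
    patches = rng.normal(size=(10, 8))   # stand-in for 10 patch embeddings from one WSI
    molecular = rng.normal(size=5)       # stand-in for a molecular profile
    risk, weights = risk_score(patches, molecular, params)
    print("risk:", risk)
    print("most-attended patch:", int(weights.argmax()))
```

In a sketch like this, the attention weights give a rough sense of which image regions contribute most to the score, which is one simple way a model of this kind can be made inspectable.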
This study highlights that using AI to integrate different types of clinically informed data to predict disease outcomes is feasible. “ … Our weakly supervised, multimodal deep-learning algorithm is able to fuse these heterogeneous modalities to predict outcomes and discover prognostic features that correlate with poor and favorable outcomes,” they noted.
Mahmood explained that these models could allow researchers to discover biomarkers that incorporate different clinical factors and better understand what type of information they need to diagnose different types of cancer. The researchers also quantitatively studied the importance of each diagnostic modality for individual cancer types and the benefit of integrating multiple modalities.
The AI models are also capable of elucidating pathologic and genomic features that drive prognostic predictions. The team found that the models used patient immune responses as a prognostic marker without being trained to do so, a notable finding given that previous research shows that patients whose tumors elicit stronger immune responses tend to experience better outcomes.
The Mahmood lab has built a research tool, the pathology-omics research platform for integrative survival estimation (PORPOISE), an interactive platform that directly yields prognostic markers learned by the model for thousands of patients across cancer types.
While the scientists’ proof-of-concept model reveals a newfound role for AI technology in cancer care, their research is only a first step in implementing these models clinically. Applying these models in the clinic will require incorporating larger data sets and validating on large independent test cohorts. Going forward, Mahmood aims to integrate even more types of patient information, such as radiology scans, family histories, and electronic medical records, and eventually bring the model to clinical trials.
“Future work will focus on developing more focused prognostic models by curating larger multimodal datasets for individual disease models, adapting models to large independent multimodal test cohorts, and using multimodal deep learning for predicting response and resistance to treatment,” the investigators concluded. “As research advances in sequencing technologies such as single-cell RNA-seq, mass cytometry, and spatial transcriptomics, these technologies continue to mature and gain clinical penetrance, in combination with whole-slide imaging, and our approach to understanding molecular biology will become increasingly spatially resolved and multimodal.”