Scientists at Harvard Medical School have designed a versatile, ChatGPT-like AI model that is capable of performing an array of diagnostic tasks across multiple forms of cancer. The researchers, headed by Kun-Hsing Yu, MD, PhD, said the new AI system, Clinical Histopathology Imaging Evaluation Foundation (CHIEF), goes a step beyond many current AI approaches to cancer diagnosis.
Current AI systems are typically trained to perform specific tasks—such as detecting cancer presence or predicting a tumor’s genetic profile—and they tend to work only in a handful of cancer types. By contrast, the new model can perform a wide array of tasks and was tested on 19 cancer types, giving it a flexibility like that of large language models such as ChatGPT.
Other foundation AI models for medical diagnosis based on pathology images have recently emerged, but the researchers believe theirs is the first to predict patient outcomes and to have those predictions validated across several international patient groups. The findings, the research team said, add to growing evidence that AI-powered approaches can enhance clinicians’ ability to evaluate cancers efficiently and accurately, including the identification of patients who might not respond well to standard cancer therapies.
“Our ambition was to create a nimble, versatile ChatGPT-like AI platform that can perform a broad range of cancer evaluation tasks,” said Yu, who is assistant professor of biomedical informatics at the Blavatnik Institute at Harvard Medical School. “Our model turned out to be very useful across multiple tasks related to cancer detection, prognosis, and treatment response across multiple cancers.” Yu is senior author of the team’s published paper in Nature, titled, “A pathology foundation model for cancer diagnosis and prognosis prediction.” In their paper, the team concluded, “Accurate, robust, and rapid pathology sample assessment provided by CHIEF will contribute to the development of personalized cancer management.”
Histopathology image evaluation is integral to the diagnosis of cancers and cancer subtype classification, the authors noted. Standard artificial intelligence methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task. However, they noted, “Although such methods have achieved some success, they often have limited generalizability to images generated by different digitization protocols or samples collected from different populations.”
The newly reported AI model works by reading digital slides of tumor tissues. It detects cancer cells and predicts a tumor’s molecular profile from cellular features visible on the image, and it does so with greater accuracy than most current AI systems. The development builds on Yu’s previous research in AI systems for the evaluation of colon cancer and brain tumors. These earlier studies demonstrated the feasibility of the approach within specific cancer types and specific tasks.
“… we devised the Clinical Histopathology Imaging Evaluation Foundation (CHIEF) model, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation,” the investigators explained. “CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition.”
CHIEF was trained on 15 million unlabeled tile images chunked into sections of interest. “Tile-level unsupervised pretraining established a general feature extractor for hematoxylin–eosin-stained histopathological images collected from heterogeneous publicly available databases, which captured diverse manifestations of microscopic cellular morphologies,” the authors explained. The tool was then trained further on 60,000 whole-slide images (WSI) of tissues including lung, breast, prostate, colorectal, stomach, esophageal, kidney, brain, liver, thyroid, pancreatic, cervical, uterine, ovarian, testicular, skin, soft tissue, adrenal gland, and bladder. “Subsequent WSI-level weakly supervised pretraining constructed a general-purpose model by characterizing the similarities and differences between cancer types.”
Training the model to look both at specific sections of an image and the whole image allowed it to relate specific changes in one region to the overall context. This approach, the researchers said, enabled CHIEF to interpret an image more holistically by considering a broader context, instead of just focusing on a particular region. “We evaluated the performance of CHIEF in a wide range of pathology evaluation tasks, including cancer detection, tumor origin prediction, genomic profile identification, and survival prediction.”
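The idea of combining tile-level features into a single whole-slide view can be illustrated with a toy attention-pooling sketch. This is not CHIEF’s actual architecture, which the article does not detail; the function names, feature vectors, and relevance scores below are hypothetical, and in a trained model the scores would be learned rather than hand-set.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(tile_features, tile_scores):
    """Weighted average of tile feature vectors, giving one slide-level vector."""
    weights = softmax(tile_scores)
    dim = len(tile_features[0])
    slide_vector = [
        sum(w * feat[i] for w, feat in zip(weights, tile_features))
        for i in range(dim)
    ]
    return slide_vector, weights

# Three hypothetical tiles, each with a 2-dimensional feature vector.
tile_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical relevance scores; the third tile is treated as most informative.
tile_scores = [0.0, 0.0, 2.0]

slide_vector, weights = attention_pool(tile_features, tile_scores)
```

Because the weights sum to 1, the slide-level vector stays on the same scale as the tile features, and the tiles with the highest weights are the natural candidates for highlighting on the image.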
Following training, the team tested CHIEF’s performance on more than 19,400 whole-slide images from 32 independent datasets collected from 24 hospitals and patient cohorts across the globe. Their evaluations showed that the CHIEF AI model could forecast patient survival across multiple cancer types and accurately pinpoint features in the tumor microenvironment that are related to a patient’s response to standard treatments, including surgery, chemotherapy, radiation, and immunotherapy. The tool also appeared capable of generating novel insights, identifying specific tumor characteristics not previously known to be linked to patient survival.
Overall, CHIEF outperformed other state-of-the-art AI methods by up to 36% on the following tasks: cancer cell detection, tumor origin identification, predicting patient outcomes, and identifying the presence of genes and DNA patterns related to treatment response. Because of its versatile training, CHIEF performed equally well no matter how the tumor cells were obtained, whether via biopsy or through surgical excision. “CHIEF consistently attained superior performance in a variety of cancer identification tasks using either biopsy or surgical resection slides,” the authors wrote. And it was just as accurate, regardless of the technique used to digitize the cancer cell samples. This adaptability, the researchers said, renders CHIEF usable across different clinical settings and represents an important step beyond current models that tend to perform well only when reading tissues obtained through specific techniques.
CHIEF achieved nearly 94% accuracy in cancer detection and significantly outperformed current AI approaches across 15 datasets containing 11 cancer types. In five biopsy datasets collected from independent cohorts, CHIEF achieved 96% accuracy across multiple cancer types, including esophagus, stomach, colon, and prostate. When the researchers tested CHIEF on previously unseen slides from surgically removed tumors of the colon, lung, breast, endometrium, and cervix, the model performed with more than 90% accuracy.
A tumor’s genetic makeup holds critical clues to its future behavior and optimal treatments. To get this information, oncologists order DNA sequencing of tumor samples, but such detailed genomic profiling of cancer tissues is done neither routinely nor uniformly across the world because of the cost and time involved in sending samples to specialized DNA sequencing labs. Even in well-resourced regions, the process can take several weeks. It’s a gap that AI could fill, Yu said. Rapidly identifying cellular patterns on an image that suggest specific genomic aberrations could offer a quick and cost-effective alternative to genomic sequencing, the researchers said.
CHIEF outperformed current AI methods for predicting genomic variations in a tumor by looking at the microscopic slides. The AI tool successfully identified features associated with several important genes related to cancer growth and suppression, and predicted key genetic mutations related to how well a tumor might respond to various standard therapies. “… CHIEF substantially outperformed baseline methods in predicting genomic variations using pathology imaging profiles,” the scientists reported. “In particular, CHIEF predicted the mutation status of several oncogenes and tumor suppressors with higher performance (AUROCs > 0.8), such as TP53, GTF2I, BTG2, CIC, CDH1, IGLL5, and NRAS.”
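For readers unfamiliar with the AUROC figure quoted above, the metric can be read as the probability that a randomly chosen mutated sample receives a higher predicted score than a randomly chosen non-mutated one. The sketch below computes it by direct pairwise comparison; the function name, labels, and scores are made-up illustrations and are not taken from the study.

```python
def auroc(labels, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical mutation labels (1 = mutated) and model scores for six slides.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

result = auroc(labels, scores)  # 8 of 9 positive/negative pairs are ranked correctly
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect ranking, so an AUROC above 0.8 indicates that the model ranks mutated samples above non-mutated ones in more than four out of five such comparisons.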
CHIEF also detected specific DNA patterns related to how well a colon tumor might respond to immune checkpoint blockade immunotherapy. When looking at whole-tissue images, CHIEF identified mutations in 54 commonly mutated cancer genes with an overall accuracy of more than 70%, outperforming the current state-of-the-art AI method for genomic cancer prediction. Its accuracy was greater for specific genes in specific cancer types.
In addition, the team tested CHIEF on its ability to predict mutations linked with response to FDA-approved targeted therapies across 18 genes spanning 15 anatomic sites. CHIEF attained high accuracy in multiple cancer types, including 96% in detecting a mutation in EZH2, which is common in a blood cancer called diffuse large B-cell lymphoma. It achieved 89% for BRAF gene mutation in thyroid cancer, and 91% for NTRK1 gene mutation in head and neck cancers.
CHIEF successfully predicted patients’ survival outcomes using the histopathology images obtained at the time of initial diagnosis, the authors further reported. “In all cancer types and all study cohorts, CHIEF distinguished patients with longer-term survival from those with shorter-term survival.” Overall, CHIEF outperformed other models by 8%, and in patients with more advanced cancers the margin reached up to 10%. “We observed similar performance trends in patients with stage III and stage IV cancers, with CHIEF outperforming other methods by up to 10%.” In all, CHIEF’s ability to predict high versus low death risk was tested and confirmed across patient samples from 17 different institutions.
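One common way to turn a model’s risk scores into the high-risk and low-risk groups described here is to split patients at the cohort median. The sketch below assumes that approach purely for illustration; the article does not state the cutoff the authors used, and the function name, scores, and follow-up times are invented toy values.

```python
from statistics import mean, median

def split_by_risk(risk_scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median."""
    cutoff = median(risk_scores)
    return ["high" if score > cutoff else "low" for score in risk_scores]

# Made-up toy values for illustration only; not data from the study.
risk_scores = [0.2, 0.9, 0.4, 0.7]   # hypothetical model outputs
survival_months = [60, 8, 48, 15]    # hypothetical follow-up times

groups = split_by_risk(risk_scores)

# Average follow-up time within each risk group.
mean_survival = {
    label: mean(m for m, g in zip(survival_months, groups) if g == label)
    for label in ("high", "low")
}
```

In practice, a comparison of this kind would typically rely on survival curves and a statistical test rather than simple averages, since some patients are still alive at the end of follow-up.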
The model identified tell-tale patterns on images related to tumor aggressiveness and patient survival. To visualize these areas of interest, CHIEF generated heat maps on an image. When human pathologists analyzed these AI-derived hot spots, they saw intriguing signals reflecting interactions between cancer cells and surrounding tissues. One such feature was the presence of greater numbers of immune cells in areas of the tumor in longer-term survivors, compared with shorter-term survivors. That finding, Yu noted, makes sense because a greater presence of immune cells may indicate the immune system has been activated to attack the tumor.
When looking at the tumors of shorter-term survivors, CHIEF identified regions of interest marked by abnormal size ratios between cell components, more atypical features in cell nuclei, weak connections between cells, and less connective tissue in the area surrounding the tumor. “In cancer samples from shorter-term survivors, high-attention regions exhibited larger nuclear/cytoplasmic ratios, more pronounced nuclear atypia, less stromal fibrosis, and weak intercellular adhesion,” the investigators continued.
These tumors also had a greater presence of dying cells around them. For example, in breast tumors, CHIEF pinpointed as an area of interest the presence of necrosis—or cell death—inside the tissues. On the flip side, breast cancers with higher survival rates were more likely to have preserved cellular architecture resembling healthy tissues. The visual features and zones of interest related to survival varied by cancer type, the team noted.
“In conclusion, CHIEF is a foundation model useful for a wide range of pathology evaluation tasks across several cancer types,” the authors stated. “CHIEF required minimal image annotations and extracted detailed quantitative features from WSIs, which enabled systematic analyses of the relationships among morphological patterns, molecular aberrations, and important clinical outcomes.”
The researchers said they plan to refine CHIEF’s performance and augment its capabilities. Future work will include conducting additional training on images of tissues from rare diseases and non-cancerous conditions, and samples from pre-malignant tissues before cells become fully cancerous. The team also aims to expose the model to additional molecular data to enhance its ability to identify cancers with different levels of aggressiveness, and further train the model to also predict the benefits and adverse effects of novel cancer treatments in addition to standard treatments.
“If validated further and deployed widely, our approach, and approaches similar to ours, could identify early on cancer patients who may benefit from experimental treatments targeting certain molecular variations, a capability that is not uniformly available across the world,” Yu further suggested.