Researchers at City of Hope and the Translational Genomics Research Institute (TGen) have developed and tested a machine-learning approach that they suggest could one day enable earlier blood-based detection of cancer using only small blood draws. The technology is based on an algorithm called Alu Profile Learning Using Sequencing (A-PLUS), which the team validated and tested across four cohorts encompassing thousands of samples, including samples from patients with breast, colon and rectum, esophagus, lung, liver, pancreas, ovary, or stomach cancers.

A-PLUS distinguishes individuals with cancer from those without cancer on the basis of the representation of Alu elements in their plasma cell-free DNA. Results from the newly reported study found that A-PLUS, when combined with measures of aneuploidy and eight common protein biomarkers, identified about half of the cancers across the 11 tumor types studied. The test was also highly specific, returning a false positive in roughly one out of every 100 people without cancer. Importantly, most of the cancer samples tested originated from people with early-stage disease, who had few or no metastatic lesions at diagnosis.

“A huge body of evidence shows that cancer caught at later stages kills people,” said Cristian Tomasetti, PhD, director of City of Hope’s Center for Cancer Prevention and Early Detection, and corresponding author of the researchers’ study in Science Translational Medicine. “This new technology gets us closer to a world where people will receive a blood test annually to detect cancer earlier when it is more treatable and possibly curable.” The researchers’ paper is titled “Machine learning to detect SINEs of cancer.” In their report, they concluded, “The evaluation of Alu elements may therefore have the potential to enhance the performance of several methods designed for the earlier detection of cancer.”

Tomasetti explained that 99% of people diagnosed with Stage 1 breast cancer will be alive five years later; however, if it is found at Stage 4, when disease has spread to other organs, the five-year survival drops to 31%.

Alu elements are short interspersed nuclear elements (SINEs) of ~300 base pairs, with more than one million copies spread throughout the human genome, the authors explained. While these elements are the subject of ongoing research, some have already been shown to be involved in the regulation of tissue-specific genes. “In cancer cells, they participate in structural changes, probably through homologous recombination given their widespread distribution throughout the genome and highly similar sequences … there is much precedent for Alu sequence elements being especially prone to epigenetic changes in various cancers,” the scientists wrote.

Instead of analyzing specific DNA mutations by looking for one misarranged letter out of billions of letters, the investigators devised a new approach that detects differences in fragmentation patterns between cancer and normal cell-free DNA (cfDNA) in repetitive regions of the genome. This fragmentomics approach requires about one-eighth as much blood as whole-genome sequencing, Tomasetti said.

When a cell dies, it breaks down and some of its DNA leaches into the bloodstream. Cancer signals can be found in this cfDNA. cfDNA from normal cells fragments to a typical size, whereas cfDNA from cancer cells breaks at altered positions. This alteration is hypothesized to be more pronounced in repetitive regions of the genome. “Alu elements also reflect the altered fragmentation patterns found in the cfDNA of patients with cancer,” the scientists continued. They hypothesized that the representation of specific Alu elements might differ between plasma cfDNA from patients with cancer and cfDNA from normal controls.

Because there are so many Alu elements in the genome, evaluating this hypothesis required the development of machine learning tools, and the team developed A-PLUS to distinguish individuals with cancer from those without cancer on the basis of the representation of Alu elements in their cfDNA.

The machine learning platform was trained and validated on four separate patient cohorts, totalling 7615 samples from 5178 individuals, including 2073 with solid cancers, and the remainder without cancer. “Samples from patients with cancer and controls were prespecified into four cohorts used for model training, analyte integration, and threshold determination, validation, and reproducibility,” the team explained.
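The size of the non-cancer group follows directly from the reported totals. A short Python sketch of that arithmetic, assuming (as the sentence reads) that the figure of 2073 counts individuals rather than samples:

```python
# Totals reported for the four cohorts combined.
total_samples = 7615
total_individuals = 5178
individuals_with_cancer = 2073  # assumed to count individuals, not samples

# "The remainder without cancer" is the difference of the two counts.
individuals_without_cancer = total_individuals - individuals_with_cancer

# Derived ratios, for orientation only.
cancer_fraction = individuals_with_cancer / total_individuals
samples_per_individual = total_samples / total_individuals

print(individuals_without_cancer)  # 3105
print(f"share of individuals with cancer: {cancer_fraction:.1%}")
print(f"average samples per individual: {samples_per_individual:.2f}")
```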

Their results showed that in the validation cohort, A-PLUS alone provided a sensitivity of 40.5% across 11 different cancer types, at a specificity of 98.5%. Combining A-PLUS with aneuploidy and eight common protein biomarkers detected 51% of the cancers at 98.9% specificity.
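For readers less familiar with these metrics, sensitivity is the share of people with cancer who test positive, and specificity is the share of people without cancer who test negative. The Python sketch below shows how both are computed. The counts are hypothetical, chosen only so that the results match the percentages reported for A-PLUS alone; they are not data from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of people with cancer whom the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of people without cancer whom the test reports as negative."""
    return true_neg / (true_neg + false_pos)


def false_positives_per_100(spec: float) -> float:
    """Expected false positives among 100 people without cancer."""
    return 100 * (1 - spec)


# Hypothetical counts (not from the study): 1000 people with cancer,
# 1000 people without cancer.
sens = sensitivity(true_pos=405, false_neg=595)
spec = specificity(true_neg=985, false_pos=15)

print(f"sensitivity: {sens:.1%}")
print(f"specificity: {spec:.1%}")
print(f"false positives per 100 cancer-free people: {false_positives_per_100(spec):.1f}")
```

At 98.5% specificity this works out to about 1.5 false positives per 100 people without cancer, and at 98.9% specificity to about 1.1.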

The team said that the power of A-PLUS could be ascribed to a single feature, “ … the global reduction of AluS subfamily elements in the circulating DNA of patients with solid cancer.” They further commented, “ … our study shows that Alu element representations, in general, and AluS subfamily elements, in particular, are altered in the cfDNA of patients with many different cancer types … Future investigation of the mechanisms underlying their altered representation will be facilitated by their abundance in the genome and their similar sequences and structures.”
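To make the idea of "representation" concrete, the toy Python sketch below computes the share of Alu-derived reads assigned to AluS subfamilies and flags a sample when that share falls below a cutoff. This is not the A-PLUS algorithm, which is a machine-learning model whose details are not given here. The function names, the read labels, and the 0.55 threshold are all hypothetical and serve only to illustrate what a reduced AluS representation could look like in data.

```python
from collections import Counter


def alus_fraction(subfamily_labels):
    """Share of Alu-derived reads whose subfamily name starts with 'AluS'."""
    counts = Counter(subfamily_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no Alu reads supplied")
    alus = sum(n for name, n in counts.items() if name.startswith("AluS"))
    return alus / total


def flag_sample(subfamily_labels, threshold):
    """True when the AluS share is below the (hypothetical) threshold."""
    return alus_fraction(subfamily_labels) < threshold


# Made-up read labels for illustration; real data would hold millions of reads.
control_reads = ["AluSx"] * 60 + ["AluY"] * 15 + ["AluJb"] * 25
patient_reads = ["AluSx"] * 50 + ["AluY"] * 20 + ["AluJb"] * 30

HYPOTHETICAL_THRESHOLD = 0.55  # illustrative cutoff, not a value from the study

print(alus_fraction(control_reads), flag_sample(control_reads, HYPOTHETICAL_THRESHOLD))
print(alus_fraction(patient_reads), flag_sample(patient_reads, HYPOTHETICAL_THRESHOLD))
```

A single cutoff like this would be far too crude in practice; the sketch only shows the kind of quantity a classifier could learn from.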

“Our technique is more practical for clinical applications as it requires smaller quantities of genomic material from a blood sample,” said co-first author Kamel Lahouel, PhD, an assistant professor in TGen’s Integrated Cancer Genomics Division. “Continued success in this area and clinical validation opens the door for the introduction of routine tests to detect cancer in its earliest stages.”

Tomasetti is poised to open a clinical trial in summer 2024 to compare this fragmentomics blood testing approach with standard of care in adults aged 65–75. The prospective trial will determine how effective the biomarker panel is at detecting cancer at an earlier stage, when it is more treatable.
