Researchers at the CUNY Graduate Center have created an artificial intelligence model, Context-aware Deconfounding Autoencoder (CODE-AE), that can screen drug compounds to accurately predict efficacy in humans. In tests, the model identified personalized drugs that could, in theory, better treat more than 9,000 cancer patients. The researchers expect the technique to improve the accuracy and reduce the time and cost of drug discovery and development, and to accelerate precision medicine.
“Our new machine learning model can address the translational challenge from disease models to humans,” said Lei Xie, PhD, a professor of computer science, biology and biochemistry at the CUNY Graduate Center and Hunter College. “CODE-AE uses biology-inspired design and takes advantage of several recent advances in machine learning. For example, one of its components uses similar techniques in Deepfake image generation.” Xie is senior author of the team’s published paper in Nature Machine Intelligence, titled “A Context-aware Deconfounding Autoencoder for Robust Prediction of Personalized Clinical Drug Response From Cell Line Compound Screening,” in which the authors concluded, “… CODE-AE provides a useful framework to take advantage of rich in vitro omics data for developing generalized clinical predictive models.”
The journey from identifying a potential therapeutic compound to FDA approval of a new drug can take well over a decade and cost upwards of a billion dollars. Accurate and robust prediction of patient-specific responses to a new chemical compound is critical both for discovering safe and effective therapeutics and for selecting an existing drug for a specific patient. However, it is unethical and infeasible to do early efficacy testing of a drug in humans directly. Cell or tissue models are often used as a surrogate for the human body to evaluate the therapeutic effect of a drug molecule. “In the early stage of drug discovery, cell line and other in vitro models have been extensively applied to screen drug candidates,” the team noted.
Unfortunately, a drug's effect in a disease model often does not correlate with its efficacy and toxicity in human patients. “This discrepancy is responsible for the high cost and low success rate of drug discovery,” the team continued. And even for drugs that have been tested in clinical trials, patient responses to treatment can vary significantly. Moreover, “… it is often difficult to collect a large number of coherent patient data with drug treatment and response history to reliably predict which patient will benefit from the drug.”
Developing an AI model for predicting patient-specific clinical drug responses from in vitro screens is challenging, but the new model can provide a workaround to the problem of not having sufficient patient data to train a generalized machine learning model, said You Wu, a CUNY Graduate Center PhD student and co-author of the paper. “Although many methods have been developed to utilize cell-line screens for predicting clinical responses, their performances are unreliable due to data incongruity and discrepancies,” Wu noted. “CODE-AE can extract intrinsic biological signals masked by noise and confounding factors and effectively alleviated the data-discrepancy problem.”
As a result, CODE-AE significantly improves accuracy and robustness over state-of-the-art methods in predicting patient-specific drug responses purely from cell-line compound screens, the team suggests. “Extensive benchmark studies demonstrate the advantage of CODE-AE over state-of-the-arts in terms of both accuracy and robustness.” In their published paper, the researchers further described their use of CODE-AE to screen 59 drugs for 9,808 cancer patients from The Cancer Genome Atlas. “Our results are consistent with existing clinical observations, suggesting the potential of CODE-AE in developing personalized therapies and drug-response biomarkers,” they reported.
The researchers’ next challenge in advancing the technology’s use in drug discovery is developing a way for CODE-AE to reliably predict the effect of a new drug’s concentration and metabolization in the human body. They also noted that, “In principle, integrating multiple omics data may benefit drug response predictions,” and suggested that the AI model could potentially be adapted to accurately predict the side effects of drugs in humans. In conclusion, they stated, “Although CODE-AE is only applied to precision oncology here, it can be a general framework for other transfer learning tasks where two data modalities have shared and unique features … Thus CODE-AE provides a useful framework to take advantage of rich in vitro omics data for developing generalized clinical predictive models.”