May 1, 2018 (Vol. 38, No. 9)

Big Data Plus Machine Learning Equals Scientific Advancement

More is better when it comes to Big Data and machine learning. This is particularly true in the fields of medicine and pharma. A report by Accenture estimates that by the year 2026, Big Data in conjunction with machine learning in medicine and pharma will be generating value at a prodigious rate: $150 billion/year.

This figure reflects how the tools of artificial intelligence (AI) are expected to help doctors, patients, insurers, and overseers reach better decisions, optimize innovations, and improve research and clinical trial efficiency.

Healthcare data comes from myriad sources: hospitals, doctors, patients, caregivers, and research. The challenge is putting all the data together in a compatible format and using it to develop better healthcare networks and protocols. This is where machine learning comes in.

The main purpose of machine learning applications specific to medicine and pharmacotherapy is to make data accessible and usable for improving prevention, diagnosis, and treatment as a matter of course. Pioneers in medicine and pharma machine learning are already addressing some key areas.

Machine Learning Applications in Medicine and Pharma

This article is informed by a TechEmergence analysis of AI initiatives undertaken by the five largest global drug makers. Whereas the analysis presents a broad survey, covering all the major trends of industry applications in life sciences and biotech, this article is more focused. It emphasizes six of the trends that TechEmergence believes will be most meaningful in the near term.

1. Diagnosis and Disease Identification

Correct diagnosis and identification of disease is among the biggest challenges in medicine, which makes it a top priority in machine learning development. A 2015 report indicated that more than 800 cancer treatments were in clinical trials. Knight Institute researcher Jeff Tyner stated in an interview that the problem was putting all the data together to get at what researchers can use. “That is where the idea of a biologist working with information scientists and computationalists is so important,” he stated.

Big companies such as IBM Watson Health are pioneering machine learning technology for this purpose. Together with Quest Diagnostics, IBM Watson Health developed IBM Watson Genomics, which uses machine learning to make cancer identification more precise. Berg, a biopharma company based in Boston, uses an AI platform on clinical trial patient data to develop new drugs for various diseases.

Other big players are DeepMind Health, a Google subsidiary, and P1vital Products, an affiliate company of P1vital, a contract research organization specializing in central nervous system disorders. DeepMind has been working on macular degeneration, and P1vital Products has been collaborating with University of Oxford researchers to conduct PReDicT, the Predicting Response to Depression Treatment study, which is focused on mental health issues such as depression.

2. Personalized Medicine

Much research is under way on the use of machine learning and predictive analytics to customize treatment to a person’s unique health history. If successful, this can result in optimized diagnoses and treatment protocols. Currently, the focus is on supervised learning, in which doctors can use genetic information and symptoms to narrow down diagnostic options or make an educated guess about a patient’s risk. This can lead to better preventive measures. One of the pioneers in this area is IBM Watson Oncology, which uses the patient’s medical history and personal information to design the best treatment.

The predicted surge in the use of advanced health-measuring mobile apps as well as microbiosensors and devices in the next 10 years will provide a wealth of data that can help point the way for effective research and development and better treatment protocols. Aside from better health management, personalized medicine also means lower costs overall, up and down the chain.

In line with prevention, many startups are getting on board the behavioral modification train. Catalia Health’s Cory Kidd expounded on this approach in an interview with TechEmergence. A few such startups were featured in Entrepreneur, including SkinVision, whose app assesses skin cancer risk, and Somatix, whose gesture detection app for wearable devices helps with smoking cessation.

3. Drug Discovery and Manufacture

Machine learning plays many roles in early-stage drug discovery, such as the development of new drug compounds, and in discovery technologies, such as next-generation sequencing. One of the first applications in this field is precision medicine, which makes identification of complex diseases and possible treatment modalities more efficient. The research uses unsupervised learning, which looks for patterns in data without being given outcomes to predict.
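To make the idea of unsupervised learning concrete, here is a minimal sketch of k-means clustering, one common unsupervised technique. It is an illustration only, not the method used by any group named in this article. The two-number patient "profiles" and the starting centroids are hypothetical values chosen for the example.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate an assignment step and a centroid update step."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Hypothetical two-feature profiles (for example, two normalized biomarker levels).
profiles = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

# No outcome labels are supplied; the algorithm only groups similar profiles.
labels, centroids = k_means(profiles, centroids=[profiles[0], profiles[3]])
print(labels)
```

The algorithm is never told which group a profile belongs to. It finds the two groups from the structure of the data alone, which is the sense in which unsupervised learning seeks patterns without predicting outcomes.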

Among the big players in precision medicine using machine learning is the MIT Clinical Machine Learning Group, which focuses on algorithm development. Microsoft’s Project Hanover uses machine learning to, among other things, develop a personalized drug protocol to manage acute myeloid leukemia. The UK Royal Society notes that machine learning in biomanufacturing can help pharma companies optimize and speed up production by analyzing manufacturing process data.

A TechEmergence interview with Murali Aravamudan, founder and CEO of Qrativ, provided some insight on the subject of drug repurposing. Essentially, this means finding new conditions that an already approved drug can treat, so as to make the most of research and development costs. For context, the average cost to discover and develop a prescription drug all the way to approval for the market is about $2.6 billion (as of 2014). The role of AI is to “connect the dots” across enormous amounts of clinical, genomic, and patient data to find the potential utility of an existing drug for a certain condition or disease. “Connecting the dots was harder a few years back,” Aravamudan observed. “Modern AI techniques in the last three years have given new sets of tools that … enable us to [pose] this triangulation question … in a more meaningful sense.”

4. Clinical Trials

Clinical trial research is a long and arduous process. Machine learning can help make it less so in various ways. One is by using advanced predictive analytics on a wide range of data to identify clinical trial candidates from target populations much more quickly. Analysts at McKinsey describe other machine learning applications that can make clinical trials more efficient by calculating ideal sample sizes, facilitating patient recruitment, and using medical records to minimize data errors.
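For a sense of what calculating a sample size involves, the sketch below uses the classical normal-approximation formula for a two-arm trial that compares response rates. This is a standard statistical formula rather than a machine learning method, and it is not taken from the McKinsey analysis. The response rates, significance level, and power are hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Patients needed per arm to detect a difference between two response rates.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    for a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical trial: 50% response on standard care, 60% hoped for on the new drug.
n = sample_size_per_arm(0.50, 0.60)
print(n)
```

Halving the expected improvement roughly quadruples the required enrollment, which is one reason better estimates of effect size and faster recruitment matter so much to trial cost.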

5. Radiotherapy and Radiology

Harvard Medical School assistant professor Ziad Obermeyer, M.D., stated in a 2016 interview: “In 20 years, radiologists won’t exist in anywhere near their current form. They might look more like cyborgs: supervising algorithms reading thousands of studies per minute.” Currently, DeepMind Health with University College London Hospital is developing machine learning algorithms to increase the accuracy of radiotherapy planning by differentiating healthy tissues from cancerous ones.

6. Electronic Health Records

Support vector machines (classification algorithms that can be used, for example, to sort patient email queries) and optical character recognition (a technology for digitizing handwritten notes) are essential components of machine learning systems for document classification. Examples of tools that offer these technologies are MathWorks’ MATLAB (a computing environment with machine learning tools and handwriting recognition applications) and Google’s Cloud Vision API.
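As a rough illustration of how a support vector machine might sort patient messages, the sketch below trains a linear SVM on word counts by subgradient descent on the regularized hinge loss. It is a toy under stated assumptions, not the approach of any product mentioned here: the four training messages, the two categories (+1 for pharmacy, -1 for scheduling), and the parameter values are all hypothetical.

```python
from collections import defaultdict

def train_linear_svm(messages, labels, epochs=20, lr=0.1, lam=0.01):
    """Fit bag-of-words weights by subgradient descent on the regularized hinge loss."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, y in zip(messages, labels):
            words = text.lower().split()
            margin = y * sum(weights[w] for w in words)
            # L2 regularization shrinks every weight slightly on each step.
            for w in list(weights):
                weights[w] *= 1.0 - lr * lam
            # The hinge loss only pushes on examples that fall inside the margin.
            if margin < 1.0:
                for w in words:
                    weights[w] += lr * y
    return weights

def predict(weights, text):
    """Return +1 or -1 from the sign of the summed word weights."""
    score = sum(weights.get(w, 0.0) for w in text.lower().split())
    return 1 if score >= 0 else -1

# Hypothetical training messages: +1 = pharmacy query, -1 = scheduling query.
messages = [
    "refill prescription request",
    "prescription renewal needed",
    "book appointment tomorrow",
    "reschedule appointment please",
]
labels = [1, 1, -1, -1]

weights = train_linear_svm(messages, labels)
print(predict(weights, "prescription refill"))
print(predict(weights, "appointment reschedule"))
```

A production system would need far more data, proper text preprocessing, and a tested library implementation. The point here is only that the classifier learns a weight per word and routes a message by the sign of the total.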

One of the foci of the MIT Clinical Machine Learning Group is on machine learning–based technologies for intelligent electronic health records. The idea is to develop “robust machine learning algorithms that are safe, interpretable, can learn from little labeled training data, understand natural language, and generalize well across medical settings and institutions.”

The biggest obstacle to seamless electronic health records is the lack of synchronicity between the medical profession and the companies that develop electronic health record (EHR) systems. Healthcare AI developers need to understand the nature of healthcare data to provide automated EHR data management systems.

In communications with TechEmergence, Remedy Health’s co-founders William Jack and Nikhil Buduma stated, “If we can somehow seamlessly capture the relevant data in a highly structured, thorough, repetitive, granular method, we remove that burden from the physician. The physician is happier, we save the patient money, and we get the kind of data we need to do the game-changing AI work.”

Challenges Ahead

Investors seem confident that machine learning and AI will advance the life sciences and healthcare, but technological hurdles remain. Data privacy, for example, is a major issue. The most useful information is often personal medical data, which is difficult to access.

A U.K. study, however, shows that 83% of participants are willing to share their data for research as long as they remain anonymous. Transparency is another hurdle. Drug development regulations require transparent algorithms. In other words, people need to be able to understand how a machine learning system reaches its conclusions.

It is not easy to find people with pharma expertise who also have an expertise in artificial intelligence. At TechEmergence, we’ve taken the opportunity to address the life sciences/data science talent divide, and we suspect that it will remain a serious issue for the coming half decade at least.

Daniel Faggella is founder of TechEmergence. This article is based on a feature entitled “7 Applications of Machine Learning in Pharma and Medicine,” which originally appeared on
