Recent developments in the generation and analysis of patient data are paving the way to a new era of precision medicine. Precision, or personalized, medicine harnesses patient-specific data to create therapeutic strategies tailored to individual patients. Today, patient data are generated on a scale that is orders of magnitude greater than was possible even a decade ago. Through advanced predictive analyses, this information has the potential to enable the diagnosis, treatment, and prevention of disease at a highly personalized level.1 However, the development of such complex predictive models depends upon the capacity to organize, manage, and interpret the huge amounts of data involved.
Given the challenges associated with handling large volumes of multidimensional data using conventional data management tools, organizations are increasingly turning to platforms that allow them to get the most from their “big data.” In this article, we consider how the latest cloud-based informatics platforms are translating the goals of precision medicine into reality.
The big data revolution: Progress and challenges
Precision medicine research relies on the availability of large volumes of patient-specific data. With the introduction of high-throughput techniques such as next-generation sequencing (NGS), the volume of data that can be collected has grown enormously. Molecular profiling approaches that employ “panomic” technologies (combinations of genomic, epigenomic, proteomic, and metabolomic methods) offer a powerful means of characterizing individual patients.2 Coupled with advances in medical imaging and sensor technology, these developments have dramatically increased the scale and scope of patient-specific data available.
Furthermore, vast amounts of data can be gathered from patients themselves, using mobile technology and personal smart devices.3 Bluetooth-enabled smartphones can be used to gather data from wearable or implantable sensors, and GPS technology can be harnessed to generate continuous data on a patient’s environment. Smartphone apps have already been developed that can gather information relevant to specific diseases. A pioneering example is the Asthma Mobile Health Study,4 a project led by the Icahn School of Medicine at Mount Sinai, New York, which analyzed multidimensional data, including GPS information, to demonstrate increased reporting of asthma symptoms in regions affected by heat and pollen.
Advanced technologies such as these are enabling the collection of multidimensional data on patients’ molecular profiles, disease states, lifestyles, and environments. Used appropriately, this wealth of data can generate an unprecedented level of information on an individual’s health—insight that can be used to guide patient-specific treatment options.
However, these large quantities of data pose significant management challenges. Modern high-throughput techniques now produce data so quickly that many organizations spend longer analyzing, interpreting, and managing data than they spent generating it in the first place. The sheer volume and complexity of this data also demands more computing power and storage capacity, which raises the cost of maintaining data management systems.
In many cases, existing digital infrastructure simply cannot handle the rate at which data workflows are expanding. As a result, many data management platforms are essentially fragmented, with separate digital systems assigned to specific processes.
In order to accelerate the development of precision medicine techniques, scientists need to be able to efficiently store, organize, and analyze this data, and also streamline the process of data generation in complex high-throughput workflows. Put simply, more advanced bioinformatics tools are now needed to realize the potential of high-throughput techniques such as NGS.5
Managing precision medicine big data using cloud-based solutions
Fortunately, advanced data management tools have been developed to address the challenges of organizing, interpreting, and sharing next-generation data. Cloud-based informatics platforms present a practical way to handle big data, using cloud infrastructure to provide data storage and sharing capacity that scales with demand.
In particular, these platforms are scalable solutions that allow laboratories to use a single platform to automate the process of data acquisition, to store multidimensional data in an organized and searchable format, and ultimately to perform complex data analysis. In this way, multidimensional data can be integrated into a single digital ecosystem in which structured, unstructured, and reference data can be easily searched, mined, and analyzed. As experimental data is immediately organized and available, researchers can cross-reference data across the workflow in real time, accelerating the identification of trends and enabling faster and more informed decision making.
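As a rough illustration of what bringing separate sources into one searchable structure can look like, the Python sketch below merges records from three hypothetical systems using a shared patient identifier. The source names, field names, values, and the `integrate` and `search` helpers are all invented for this example and do not reflect the API of any particular platform.

```python
from collections import defaultdict


def integrate(sources):
    """Merge per-source records into one profile per patient.

    `sources` maps a source name (e.g. "genomics") to a list of dicts,
    each carrying a "patient_id" key plus arbitrary other fields.
    A later record for the same patient and source overwrites an earlier one.
    """
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            fields = {k: v for k, v in record.items() if k != "patient_id"}
            profiles[record["patient_id"]][source_name] = fields
    return dict(profiles)


def search(profiles, source_name, field, predicate):
    """Return sorted IDs of patients whose field in one source satisfies predicate."""
    return sorted(
        pid
        for pid, profile in profiles.items()
        if field in profile.get(source_name, {})
        and predicate(profile[source_name][field])
    )


# Invented example data: structured values and free text side by side.
sources = {
    "genomics": [
        {"patient_id": "P001", "variant_count": 12},
        {"patient_id": "P002", "variant_count": 3},
    ],
    "wearable": [
        {"patient_id": "P001", "mean_resting_hr": 58},
        {"patient_id": "P002", "mean_resting_hr": 81},
    ],
    "clinical_notes": [
        {"patient_id": "P002", "text": "Reports intermittent wheezing."},
    ],
}

profiles = integrate(sources)
high_hr = search(profiles, "wearable", "mean_resting_hr", lambda hr: hr > 75)
print(high_hr)
```

Once every source sits under the same patient key, a question that spans systems becomes a single query rather than a manual reconciliation across separate tools. A production system would also need identity matching, access control, and audit trails, none of which are shown here.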
It’s the flexible way in which cloud-based platforms organize big data that makes them such useful tools for precision medicine research. A large amount of precision medicine research relies on the use of predictive frameworks to better understand disease states and how they vary at the patient level. If all workflows and databases are integrated into a single digital ecosystem, researchers can apply the appropriate analytics that will generate the complex predictive models necessary to inform personalized treatment choices.
Many of the predictive models used for precision medicine applications are developed using machine learning (ML) approaches, and it is likely that major developments in personalized treatments will come from taking data stored in cloud-based platforms and applying artificial intelligence (AI) technology to quickly identify patterns and trends.
Multidimensional data from patients can be integrated with literature information to produce predictive models of disease,6 using ML to infer causal relationships between variables. These models enable researchers to run simulations to identify key drivers of disease, by observing the effects when a particular gene is knocked down or overexpressed. Studying these processes computationally is much quicker and less resource intensive than in vitro or in vivo alternatives, providing a useful first step in the investigation of newly identified targets.
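To make the idea of an in silico knockdown concrete, here is a deliberately simplified Python sketch. It assumes a causal network has already been inferred and represents it as a small directed graph with signed edge weights, then clamps each gene to zero in turn and records how far a downstream disease score moves. The gene names, weights, and linear propagation rule are invented for illustration; this is not the method used in the studies cited here.

```python
# Toy causal network: each (parent, child) edge carries a signed weight.
# All names and numbers are hypothetical.
EDGES = {
    ("GENE_A", "GENE_B"): 0.8,
    ("GENE_A", "GENE_C"): 0.5,
    ("GENE_B", "DISEASE_SCORE"): 0.9,
    ("GENE_C", "DISEASE_SCORE"): 0.2,
    ("GENE_D", "DISEASE_SCORE"): 0.1,
}
# Baseline expression for nodes with no parents.
BASELINE = {"GENE_A": 1.0, "GENE_D": 1.0}
# Nodes listed so that every parent appears before its children.
ORDER = ["GENE_A", "GENE_D", "GENE_B", "GENE_C", "DISEASE_SCORE"]


def simulate(clamp=None):
    """Propagate values through the network; `clamp` fixes chosen nodes."""
    clamp = clamp or {}
    values = {}
    for node in ORDER:
        if node in clamp:
            values[node] = clamp[node]  # knockdown (0.0) or overexpression (>1.0)
            continue
        incoming = [(p, w) for (p, c), w in EDGES.items() if c == node]
        if incoming:
            values[node] = sum(values[p] * w for p, w in incoming)
        else:
            values[node] = BASELINE[node]
    return values


def rank_drivers(target="DISEASE_SCORE"):
    """Rank genes by how much knocking each one down lowers the target."""
    reference = simulate()[target]
    effects = {}
    for gene in ORDER:
        if gene == target:
            continue
        knocked_down = simulate(clamp={gene: 0.0})[target]
        effects[gene] = reference - knocked_down
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


for gene, effect in rank_drivers():
    print(f"{gene}: change in disease score = {effect:.2f}")
```

In this toy network GENE_A ranks first because it acts through two downstream genes. Overexpression can be simulated the same way by clamping a gene to a value above its baseline, for example `simulate(clamp={"GENE_A": 2.0})`.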
ML approaches such as these were recently used to identify key drivers of inflammatory bowel disease in a study conducted by Sema4, a company active in the field of precision medicine. Using cloud-based Thermo Fisher Platform for Science software, the Sema4 team brought together vast amounts of patient data to identify trends and probe disease mechanisms.6 It’s hoped that such predictive models will prove invaluable for personalized treatment, allowing researchers to predict how an exogenous agent will influence the particular disease state in an individual patient.
Cloud-based tools: Paving the way to a new age of precision medicine
Precision medicine research has led to some remarkable developments in healthcare technology as well as treatments that have improved patient outcomes. However, the full potential of precision medicine is yet to be realized, and future progress will be accelerated through more effective harnessing of genotypic and phenotypic data. Global sharing of this information will help develop insights into the genetic factors that contribute to disease development, and large-scale population sequencing projects will broaden our understanding of genetic variation.7
With access to genotypic and phenotypic data on tens of millions of patients, researchers could transform the face of healthcare. Precision medicine could lead to a future society in which health is continually measured by mobile, wearable, or implantable sensors that send data to a central hub, where the sensor data can be integrated with molecular profiling data and medical records. Predictive modeling could then be used to evaluate a patient’s risk of developing a disease and to identify appropriate preventative measures or plan individualized treatments.
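The final step in that pipeline, turning an integrated patient profile into a risk estimate, might be sketched as follows. This minimal Python example uses a logistic model whose feature names, coefficients, intercept, and decision threshold are all hypothetical placeholders; a real model would be fitted to cohort data and clinically validated.

```python
import math

# Hypothetical coefficients, chosen only to make the example run.
WEIGHTS = {
    "mean_resting_hr": 0.04,     # from wearable sensor data
    "risk_variant_count": 0.6,   # from molecular profiling
    "prior_diagnosis": 1.1,      # from the medical record (0 or 1)
}
INTERCEPT = -5.0


def risk_probability(features):
    """Logistic model: map a weighted sum of features to a value between 0 and 1."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def triage(features, threshold=0.5):
    """Return a suggested action and the underlying probability."""
    p = risk_probability(features)
    action = "flag for follow-up" if p >= threshold else "routine monitoring"
    return action, p


patient_low = {"mean_resting_hr": 60, "risk_variant_count": 0, "prior_diagnosis": 0}
patient_high = {"mean_resting_hr": 85, "risk_variant_count": 2, "prior_diagnosis": 1}

for label, patient in [("low", patient_low), ("high", patient_high)]:
    action, p = triage(patient)
    print(f"{label}: p = {p:.2f}, {action}")
```

A logistic model is used here only because it is easy to read. The same interface would apply if the score came from a more complex ML model trained on the integrated data.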
There are many hurdles to overcome before this vision can be realized. However, with advances in molecular profiling supported by the integration of cloud-based platforms that enable greater use of AI technology, we are already starting to see the progress needed to make it a reality.
References
1. Gligorijevic V, Malod-Dognin N, Przulj N. Integrative methods for analysing big data in precision medicine. Proteomics 2015; 16(5): 741–758.
2. Chen R, Snyder M. Promise of personalised omics to precision medicine. Wiley Interdiscip. Rev. Syst. Biol. Med. 2013; 5(1): 73–82.
3. Kim J. Analysis of health consumers’ behaviour using self-tracker for activity, sleep, and diet. Telemed. e-Health 2014; 20: 552–558.
4. Chan YFY, et al. The Asthma Mobile Health Study, a large-scale clinical observational study using ResearchKit. Nat. Biotechnol. 2017; 35: 354.
5. Zhang J, Chiodini R, Badr A, Zhang G. The impact of next-generation sequencing on genomics. J. Genet. Genomics 2011; 38(3): 95–109.
6. Peters LA, et al. A functional genomics predictive network model identifies regulators of inflammatory bowel disease. Nat. Genet. 2017; 49: 1437.
7. Ashley EA. Towards precision medicine. Nat. Rev. Genet. 2016; 17: 507.
Nicole Rose, Genomics Application Manager for Platform for Science at Thermo Fisher Scientific.