Patricia F. Fitzpatrick Dimond, Ph.D., Technical Editor of Clinical OMICs; President of BioInsight Communications
Drug developers see advantages in handling and interpreting data generated by large-scale next-gen sequencing.
The cost of DNA sequencing has plummeted in the last decade since the human genome was published. Expenses are dropping by about 50% every five months. Last August the National Human Genome Research Institute (NHGRI) reported that researchers had received more than $24 million in grants to develop sequencing technologies that are expected to rapidly sequence a person’s genome for $1,000 or less.
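To make the halving rate concrete, here is a minimal sketch of the arithmetic, assuming the 50%-every-five-months figure holds steadily over time (a simplification). The function name and the sample time spans are illustrative and do not come from NHGRI.

```python
def remaining_cost_fraction(months: float, halving_period_months: float = 5.0) -> float:
    """Fraction of the starting cost left after `months`.

    Assumes the cost halves once per `halving_period_months` at a steady rate,
    which is an idealization of the trend quoted in the article.
    """
    return 0.5 ** (months / halving_period_months)


if __name__ == "__main__":
    # Illustrative time spans only: 5 months, 10 months, and 5 years.
    for months in (5, 10, 60):
        fraction = remaining_cost_fraction(months)
        print(f"{months:>3} months: {fraction:.6f} of the starting cost")
```

Under that assumption, five years (twelve halvings) would leave roughly 1/4,096 of the starting cost.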
But none of that sequence data helps anyone if it is inaccessible, uninterpretable, or not linked to function. As genomics starts to inform proteomics and personalized medicine, storing, managing, and analyzing the data will take massive amounts of computing power. The amount of data generated could reach the petabyte range; a petabyte is a unit of information equal to 1,000 terabytes, or 1 billion megabytes.
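The unit conversion can be checked with a short sketch. It assumes decimal (SI) units, in which each step up is a factor of 1,000; the constant and function names are illustrative.

```python
# Decimal (SI) storage units, expressed in bytes.
BYTES_PER_MEGABYTE = 10**6
BYTES_PER_TERABYTE = 10**12
BYTES_PER_PETABYTE = 10**15


def petabytes_to(unit_bytes: int, petabytes: int = 1) -> int:
    """Convert a whole number of petabytes into the unit whose size is `unit_bytes`."""
    return petabytes * BYTES_PER_PETABYTE // unit_bytes


if __name__ == "__main__":
    print(petabytes_to(BYTES_PER_TERABYTE))  # terabytes in one petabyte
    print(petabytes_to(BYTES_PER_MEGABYTE))  # megabytes in one petabyte
```

Binary units (pebibytes, tebibytes) use factors of 1,024 instead and would give slightly different figures.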
The petabyte crisis and the need to use all that data to discover novel therapeutics provide ready-made forcing functions to move companies into the cloud. Firms are leveraging cloud-based resources to bolster storage and analysis of the huge amounts of data generated from research and clinical development involving next-generation sequencing.
Professionals are asking several key questions: Will healthcare systems have sufficient storage capacity for all that information? Will the information be routinely searchable as new and potentially relevant discoveries are made? Will it be possible to routinely share structured genomic data across healthcare systems? And will it be possible to make the best clinical decisions for a patient in the context of all that data?
NGS Research
Cloud computing companies have recognized the opportunity to answer these questions and are putting down big stakes. On October 12, Google Ventures co-led a $15 million round of financing for sequencing informatics specialist DNAnexus. Using Google Cloud Storage, the company expects to provide a long-term solution for researchers who require access to the vast repository of DNA sequencing data contained in the public Sequence Read Archive (SRA) database.
The SRA database was scheduled for shutdown by the NIH along with several other bioinformatics databases. As part of its initiative with Google, DNAnexus will provide a freely accessible web-based interface that simplifies searching and accessing these datasets and improves their usability for life science research. Through this interface, researchers can also download sequence read files, including all sequences from the 1000 Genomes Project.
Users of the DNAnexus SRA website can also import SRA datasets into the commercial DNAnexus platform to access additional functionality such as mapping, RNA-seq, ChIP-seq, variant analysis, and data visualization, as well as tools for integrating SRA data with their own sequence data.
“The DNAnexus SRA website is an example of a ‘big data’ initiative that benefits from rethinking the interface in a 100% web-enabled world,” said Eric Morse, head of business development, Google Cloud Storage. “By combining Google’s massively scalable data storage infrastructure with DNAnexus’ expertise in web-based interfaces, genomics data analysis, and visualization, researchers can quickly access the world’s genomic information from any web browser.”
Clinical Investigation
On November 10, Dell moved clinical practice in oncology closer to the cloud. It committed cloud computing technology, funding, and employee engagement to support pediatric cancer research programs. The cloud platform is expected to speed computational processes, manage and store the resulting data, and provide a forum for analytics and collaboration.
The company’s foray into biomedical research will focus on a personalized medicine clinical trial being conducted by the Neuroblastoma and Medulloblastoma Translational Research Consortium (NMTRC) and supported by The Translational Genomics Research Institute (TGen).
TGen will use its genomic technology within Dell’s donated cloud to help NMTRC identify more in-depth personalized treatment strategies for children with stage IV neuroblastoma who are enrolled in the NMTRC trial. The study is being conducted in at least 10 children’s cancer centers in the U.S.
Centers will send tumor samples to TGen, and “TGen will run the gene sequencing on the individual patient tumor samples sent by the participating institutions,” James Coffin, Ph.D., vp and GM of Dell Healthcare and Life Sciences, explained to GEN.
The information will then be sent to the cloud for matching with the drug or drugs most likely to succeed in treating the cancer and to act on the most appropriate pathways. “In pediatric care in particular, you want to make sure you are using the most efficacious, least toxic drug,” said Dr. Coffin. “The way to do that is to match the tumor through a complete genomic tumor analysis from an individual patient, and then we know what pathways can be blocked by which drugs.”
Until now, the trial was primarily supported by parents and foundations. Dr. Coffin said that the first trial will enroll 13 pediatric patients with stage IV neuroblastoma, but with Dell’s support, participation in the program is expected to grow to hundreds of patients over the next three years.
This will generate more than 200 billion measurements per patient that must be analyzed, shared, and stored, Dell predicted. Computing and analyzing this information would otherwise have required weeks to months, which would have limited both the depth of analysis and the number of pediatric cancer patients who could be included in the clinical trial. But Dell expects that the time needed for such large-scale studies will be reduced to just days through the use of its cloud.
Changing Healthcare Paradigm
The feasibility of this approach has been established by the trial’s lead investigator, Giselle Sholler, M.D., pediatric oncologist at the Van Andel Research Institute (VARI). Dr. Sholler’s team evaluated the use of predictive modeling based on genome-wide mRNA expression profiles from neuroblastoma tumor biopsies to make real-time treatment decisions.
An OncInsights™ report was generated from five analytical methodologies: biomarker expression, drug target expression, network target activity, drug response signature, and drug sensitivity signature. The interactive report allows the physician and reviewing tumor board to quickly navigate the underlying knowledge and evidence at multiple levels.
While the total drug pool available to this study currently comprises 182 FDA-approved compounds, only those with established pediatric dosing were used. The pilot study results showed that all reports could be generated, a tumor board held, and an individualized treatment plan agreed to and approved by a medical monitor in ≤12 days.
Dell has high hopes that its involvement in the large-scale trial and its cloud offering will provide the computing power needed to help increase TGen’s gene sequencing and analysis capacity by 1,200 percent.
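Percent increases are easy to misread, so here is a small sketch of what the figure means if taken literally. It assumes "increase by 1,200 percent" is meant in the strict sense of adding 1,200% on top of the existing capacity; the source does not spell out the intended reading, and the function name is illustrative.

```python
def multiplier_from_percent_increase(percent_increase: float) -> float:
    """Return new_capacity / old_capacity for a given percent increase.

    A 100% increase doubles capacity (multiplier 2.0), so the multiplier is
    always 1 plus the increase expressed as a fraction.
    """
    return 1 + percent_increase / 100


if __name__ == "__main__":
    print(multiplier_from_percent_increase(1200))
```

Read strictly, the stated goal corresponds to 13 times the current capacity; a looser reading of the same phrase would be 12 times.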
Such platforms can improve collaboration among a patient’s team of physicians, pharmacists, genetic researchers, and computer scientists working on the trial. Academic scientists, clinicians, and computer companies are responding to the potential of the cloud by ramping up their partnerships to store, analyze, integrate, and share data. For healthcare, the cloud is a key enabler of the information exchange that will allow the leap from episodic care to complete wellness management.
Patricia F. Dimond, Ph.D., is a principal at BioInsight Consulting.