NextBio and Intel inked a collaboration aimed at advancing the use of Big Data technologies in genomics, including optimizing and stabilizing the Hadoop stack. The two firms will apply the experience gained from NextBio’s use of Big Data applications to improve HDFS (the Hadoop Distributed File System), Hadoop, and HBase. Improvements NextBio makes to the Hadoop stack will be contributed back to the open-source community.
Complex data requiring computationally intensive analysis demands not only open-source Big Data software but also optimized hardware and software management solutions to deliver scale, Intel explains. The collaboration with NextBio will work to deliver these capabilities to the Big Data community and the life science industry. “The use of Big Data technologies at NextBio enables researchers and clinicians to mine billions of data points in real time to discover new biomarkers, clinically assess targets and drug profiles, optimally design clinical trials, and interpret patient molecular data,” comments Satnam Alag, Ph.D., the firm’s chief technology officer and VP of engineering.
“NextBio has invested significantly in the use of Big Data technologies to handle the tsunami of genomic data being generated and its expected exponential growth. As we further scale our infrastructure to handle this growing data resource, we are excited to work with Intel to make the Hadoop stack better and give back to the open-source community.”
NextBio has developed a platform for aggregating and interpreting large quantities of genomic and life science data for research and clinical applications. The firm claims the platform houses the world’s largest repository of curated and correlated public and private genomic data, including data from multiple public repositories of genomic studies and patient molecular profiles, up-to-date reference genomes, and clinical trial results. The disparate data from these resources are systematically processed, curated, and integrated into a cloud-based platform.