Researchers at Nationwide Children's Hospital say they have developed an analysis pipeline that cuts the time it takes to search a person's genome for disease-causing variations from weeks to hours. An article (“Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics”) appears in Genome Biology.
“It took around 13 years and $3 billion to sequence the first human genome,” notes Peter White, Ph.D., principal investigator and director of the biomedical genomics core at Nationwide Children's and the study's senior author. “Now, even the smallest research groups can complete genomic sequencing in a matter of days. However, once you've generated all that data, that's the point where many groups hit a wall. After a genome is sequenced, scientists are left with billions of data points to analyze before any truly useful information can be gleaned for use in research and clinical settings.”
To overcome the challenge of analyzing such large volumes of data, Dr. White and his team developed a computational pipeline called “Churchill.” By using novel computational techniques, Churchill allows efficient analysis of a whole genome sample in as little as 90 minutes, explains Dr. White.
“Churchill fully automates the analytical process required to take raw sequence data through a series of complex and computationally intensive processes, ultimately producing a list of genetic variants ready for clinical interpretation and tertiary analysis,” he continues. “Each step in the process was optimized to significantly reduce analysis time, without sacrificing data integrity, resulting in an analysis method that is 100 percent reproducible.”
The output of Churchill (GenomeNext has licensed the algorithm) was validated using National Institute of Standards and Technology (NIST) benchmarks. In comparison with other computational pipelines, Churchill was shown to have the highest sensitivity at 99.7%, the highest accuracy at 99.99%, and the highest overall diagnostic effectiveness at 99.66%.
“At Nationwide Children's, we have a strategic goal to introduce genomic medicine into multiple domains of pediatric research and healthcare. Rapid diagnosis of monogenic disease can be critical in newborns, so our initial focus was to create an analysis pipeline that was extremely fast, but didn't sacrifice clinical diagnostic standards of reproducibility and accuracy,” says Dr. White. “Having achieved that, we discovered that a secondary benefit of Churchill was that it could be adapted for population-scale genomic analysis.”
“Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours,” wrote the investigators. “The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources.”