In 2007, Richard A. Gibbs, Ph.D., professor and director of the Human Genome Sequencing Center at Baylor College of Medicine, together with colleagues from Roche NimbleGen, published one of the first reports on solid-phase, hybridization-based enrichment of human genomic regions using programmable, custom high-density oligonucleotide microarrays. “We have incrementally improved our reagents since then,” says Dr. Gibbs.
Most recently, Dr. Gibbs and colleagues described a liquid-phase hybridization platform that uses biotinylated oligonucleotide probes, and introduced additional design changes to include more genes than the narrow consensus coding DNA sequence (CCDS) set, which frequently guides the design of custom probes but excludes many computationally predicted or actual coding exons present in other databases.
To expand the regions examined during target enrichment, the investigators introduced two new reagents. The first, VCR-set, includes microRNAs, Vega (the Vertebrate Genome Annotation Database), CCDS, and the RefSeq databases. The second capture design reagent, REC-set, additionally includes regulomes, exons, and conserved elements. Using these reagents, Dr. Gibbs and his team conducted the first genome-wide targeted capture analysis of a diverse set of biologically relevant genomic elements, and found that variants located outside the CCDS regions were captured less efficiently than those in the CCDS exome.
The results also showed that conserved untranslated regions, which have a GC content of approximately 30%, and regulatory regions, which have a GC content of approximately 70%, had approximately half the sequencing coverage depth of the CCDS regions after capture, demonstrating the need to increase coverage in genomic regions that differ from CCDS.
At the biological end, Dr. Gibbs and colleagues are applying these advances toward the discovery of disease alleles and the study of rare genetic variants. “Having illustrated the robustness of this approach in the research arena, we are on the verge of developing this into a diagnostic test in the clinical arena.”
Solution-based hybridization approaches are often more convenient than solid-phase arrays, and offer the additional advantage of being easier to multiplex. “An important improvement from our point of view, particularly as we have been using the Illumina technology for next-generation sequencing, is that we are using TruSeq, the new library preparation system that Illumina has developed,” says Ann-Christine Syvänen, Ph.D., professor of molecular medicine at Uppsala University.
While many sequencing efforts focus on capturing exomes, a significant amount of genetic variation occurs outside protein-coding regions. “It would be important to also analyze other genomic regions, and the new enrichment probes from Illumina contain some extra sequences in regulatory regions very close to genes, which add information content in addition to multiplexing.”
These short regions flanking the genes allow the detection of regulatory variants located in their vicinity. Genomic variation may also arise in gene regulatory elements located farther from open reading frames. This variation is also functional, but it will be missed by standard exome arrays.
Target enrichment and whole-genome sequencing emerge as two equally important strategies, and each of them is best powered to address specific biological questions. As these approaches are incrementally improved, optimized, and validated in research settings, they promise to materialize into exciting diagnostic and therapeutic applications and to provide powerful tools to interrogate other biological questions.