March 1, 2010 (Vol. 30, No. 5)
Targeted sequencing has obvious potential in clinical diagnostics, particularly for complex disorders that may hinge on either one or a combination of up to perhaps hundreds of genes. The approach, however, requires enriched templates for sequencing on next-generation platforms.
To address this issue and help validate the use of parallel targeted resequencing approaches for fast and cost-effective diagnostic applications, researchers in the department of human genetics at the Radboud University Nijmegen Medical Centre are evaluating a number of enrichment technologies, including Roche NimbleGen’s sequence capture microarray platform, Agilent’s SureSelect Target Enrichment System, and RainDance Technologies’ RDT 1000 instrument and Sequence Enrichment platform. For sequencing, the team is using Roche’s 454 system, as well as the Applied Biosystems SOLiD sequencer from Life Technologies.
“The use of sequencing for diagnostics is a field in development, and there is no one ideal system at this moment,” comments Joris Veltman, Ph.D., associate professor. “We started research in this area using the NimbleGen arrays (in combination with Roche 454 sequencing), which are ideal for enriching small gene sets that will possibly be required for complex monogenic diagnostic applications. We are also evaluating the RainDance system for this targeted approach.
“For our research on larger-scale exome enrichment we are comparing the NimbleGen system to the Agilent SureSelect system (in combination with SOLiD sequencing). Results to date have confirmed that target enrichment provides high specificity and coverage and a high accuracy of SNP/mutation detection.”
Professor Veltman and a team of colleagues led by Alexander Hoischen, Ph.D., recently published a validation of an array-based sequence-capture method for diagnosing genetically heterogeneous disorders. They selected autosomal recessive ataxia as a model disorder, and performed enrichment by hybridization on NimbleGen sequence-capture arrays, followed by Roche 454 Titanium shotgun sequencing. The approach was evaluated on five subjects carrying known mutations and two unaffected controls.
The method involved targeting the genomic sequences of seven disease genes, together with two control loci, on a 2 Mb sequence-capture array. After enrichment, each patient’s DNA sample was analyzed using one quarter of a Roche GS FLX Titanium sequencing run, resulting in an average of 65 Mb of sequence data per patient, the researchers report.
This was sufficient for an average 25-fold coverage per base in all targeted regions. Enrichment showed high specificity, and on average, 80% of uniquely mapped reads were on target. Importantly, the authors claim that the reported approach enabled automated detection of deletions and hetero- and homozygous point mutations for six of seven mutant alleles, with greater than 99% accuracy for known SNP variants.
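As a rough sanity check on these figures, mean coverage can be estimated from the sequence yield, the on-target fraction, and the target size. The sketch below is a back-of-envelope calculation, not the authors' analysis; the function name is illustrative, and it assumes for simplicity that the full 65 Mb maps uniquely.

```python
def mean_coverage(yield_mb, on_target_fraction, target_mb):
    """Estimate mean fold coverage per targeted base.

    Assumption: every sequenced base maps uniquely, so the estimate
    is an upper bound on what would be observed in practice.
    """
    return yield_mb * on_target_fraction / target_mb


# Figures reported in the article: 65 Mb per patient, 80% on target, 2 Mb array
estimate = mean_coverage(65, 0.80, 2)
print(f"Estimated mean coverage: {estimate:.0f}-fold")
```

The result, about 26-fold, is close to the reported 25-fold average; the small gap is what one would expect once reads that do not map uniquely are discounted.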
The researchers did observe higher coverage for the coding sequences of the disease genes (31-fold) than for noncoding sequences (24-fold). They suggest that such differences are most likely due to a greater level of uniqueness in coding sequences, rather than array design issues. The local sequence architecture also appeared to have a strong effect on the efficiency of DNA enrichment for individual exons, with those located near repeat-rich regions generally showing lower coverage.
Also, a comparison of SNP genotypes from massively parallel sequencing with SNP genotyping microarray studies suggested that SNP detection accuracy in next-generation sequencing data is mainly limited by sequence coverage and sequence misalignment, rather than by sequencing errors.
“Our results indicate that 15-fold sequence coverage (comparable to a PHRED quality score of 20) is required for reliable detection of heterozygous mutations, which was achieved for 93.3% of all coding exons in this proof of principle study,” the researchers conclude.
The reported study also aimed to evaluate general characteristics of the enrichment procedure. Consequently, a large genomic locus, as well as the total genomic content of seven disease genes, was placed on the sequence-capture array. This resulted in a total of about 2 Mb of genomic DNA for which enrichment was performed and sequencing data was obtained. For diagnostic procedures, however, the researchers stress that it should be possible to limit analysis to just the coding and upstream noncoding exons of known disease genes. This will significantly reduce the amount of sequence to be enriched and sequenced, thereby increasing the coverage achieved for a given amount of sequencing and making mutation detection more reliable, they add.
“For diagnostic applications we will develop a second-generation enrichment array containing targets for the coding sequence of all 70 genes that have been linked to ataxia. The coding sequence of these genes will still only occupy 320 Kb. The density of array probes per target can therefore be increased, resulting in even higher enrichment specificity. Because of the reduced target region, one-eighth Roche 454 sequence run will be sufficient to reach even better coverage compared to the current study while simultaneously lowering costs per patient.”
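To see why a smaller run could still improve coverage, the same kind of estimate can be applied to the planned array. This is only an illustrative projection: it assumes that yield scales linearly with the fraction of a run used (so one-eighth of a run gives half of the 65 Mb obtained from one quarter), that the on-target fraction stays at the 80% seen in the current study, and that all bases map uniquely. The variable names are ours.

```python
# Assumption: yield scales linearly with the fraction of the run used.
quarter_run_yield_mb = 65.0                      # reported for one quarter of a run
eighth_run_yield_mb = quarter_run_yield_mb / 2   # projected, not measured

# Assumption: on-target fraction unchanged from the current study.
on_target_fraction = 0.80

# Coding sequence of the 70 ataxia genes, as stated in the quote (320 Kb).
target_mb = 0.32

projected = eighth_run_yield_mb * on_target_fraction / target_mb
print(f"Projected mean coverage: {projected:.0f}-fold")
```

Under these assumptions the projection is roughly 80-fold, about three times the average of the current study, even though only half as much sequencing is used per patient.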
To further develop targeted resequencing approaches for routine clinical diagnostic applications, Professor Veltman’s group is involved in the EU-funded 7th Framework Program TECHGENE project (technological innovation of high-throughput molecular diagnostics of clinically and molecularly heterogeneous genetic disorders).
An industry-academic partnership, the three-year TECHGENE project is focused on the development of massively parallel sequencing technologies that will be able to extend genetic diagnostics from simple monogenic disorders to more complex, genetically heterogeneous diseases. Key criteria to be met include accuracy, throughput, completeness, low cost per nucleotide, and equivalence to standard Sanger sequencing-based methods.