The first step in the Titanium Optimized Exome Sequence Capture protocol (Figure 2) is the preparation of a 454 GS FLX Titanium series library from 5 µg of randomly fragmented genomic DNA. The GS FLX Titanium library is hybridized to the NimbleGen Sequence Capture 2.1M Human Exome array.
After hybridization, unwanted genomic DNA (the regions not targeted by the array) is washed away, and the captured DNA is eluted. This post-capture DNA library is quantitated to confirm that the capture met a defined quality standard. The library then enters the standard Genome Sequencer FLX System workflow at the emPCR amplification step, followed by sequencing on the Genome Sequencer FLX Instrument.
The resulting sequence data is analyzed with the GS Reference Mapper software. For sequence capture, the software application requires two input reference files: the complete human genome reference (HG18 from UCSC Genome Browser, University of California Santa Cruz), and the .gff file that describes the targeted portion of the genome enriched by the array.
The software maps all of the sequencing reads against the full human genome reference. Mapping against the full genome reference helps to eliminate false positives by determining if a sequencing read maps uniquely to the target region or if the sequencing read also maps to a region elsewhere in the genome.
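The classification logic described above can be illustrated with a short Python sketch. This is not the GS Reference Mapper implementation; it only assumes that the target file follows the standard tab-separated GFF layout (sequence name in column 1, 1-based inclusive start and end in columns 4 and 5) and that each read arrives with a list of its genome-wide mapping locations. The function names and the `hits` structure are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def parse_gff_targets(lines):
    """Collect target intervals per sequence from GFF lines.

    Coordinates are kept as given in the file: 1-based, inclusive.
    """
    targets = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a feature line
        targets[fields[0]].append((int(fields[3]), int(fields[4])))
    return {seqid: sorted(ivals) for seqid, ivals in targets.items()}


def overlaps_target(targets, seqid, start, end):
    """True if [start, end] overlaps any target interval on seqid."""
    return any(s <= end and start <= e for s, e in targets.get(seqid, []))


def classify_read(targets, hits):
    """Classify one read from its genome-wide hits.

    hits: list of (seqid, start, end) tuples (hypothetical structure).
    A read counts as on-target only if it maps to exactly one place
    in the genome and that place overlaps a target interval.
    """
    if not hits:
        return "unmapped"
    if len(hits) > 1:
        return "multi_mapped"
    seqid, start, end = hits[0]
    if overlaps_target(targets, seqid, start, end):
        return "on_target"
    return "off_target"
```

A read that overlaps a target but also maps elsewhere is labeled `multi_mapped` rather than `on_target`, which is the false-positive filter that mapping against the full genome makes possible.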
To assess the reproducibility of the sequence capture and sequencing process, the entire process was repeated six times using the publicly available human HapMap NA11881 sample (Table). Over 99% of the reads mapped to the human genome, indicating a high degree of fidelity in the capture and sequencing processes. Of those reads, ~70% mapped to the target region.
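The percentages reported above follow from simple read counts. The sketch below shows one way such metrics could be computed per run and summarized across replicates. The function names are hypothetical, and the coefficient of variation is offered here as one common way to express reproducibility, not as the statistic used in the original experiment.

```python
from statistics import mean, stdev


def capture_metrics(total_reads, mapped_reads, on_target_reads):
    """Per-run percentages.

    pct_on_target is relative to mapped reads, matching the phrasing
    "of those reads" in the text.
    """
    return {
        "pct_mapped": 100.0 * mapped_reads / total_reads,
        "pct_on_target": 100.0 * on_target_reads / mapped_reads,
    }


def summarize_replicates(values):
    """Mean, sample standard deviation, and coefficient of variation (%).

    Requires at least two replicate values.
    """
    m = mean(values)
    sd = stdev(values)
    return {"mean": m, "stdev": sd, "cv_pct": 100.0 * sd / m}
```

For instance, passing the six per-run `pct_on_target` values to `summarize_replicates` gives a single figure for run-to-run variation; a low coefficient of variation would indicate a reproducible capture.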
The vast majority of the reads that did not map to the target region fell within the introns bordering the targeted exons and were most likely captured by probes complementary to the ends of a given exon. This result offers additional value to researchers interested in querying variation within the exon/intron boundary regions. For some variants, uniformity of coverage and adequate sequencing depth for detection were achieved with just 4.5x sequencing coverage.
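To separate reads that land in the bordering introns from reads that are truly off-target, each read can be labeled by its distance to the nearest target interval. The sketch below is a minimal illustration for a single chromosome. The 100 bp default flank is an arbitrary assumption (the text does not specify a window), and the function names are hypothetical.

```python
def distance_to_nearest_target(targets, start, end):
    """Gap in bases between read [start, end] and the closest target.

    targets: list of (start, end) intervals on the same chromosome,
    1-based inclusive. Returns 0 on overlap, None if targets is empty.
    """
    best = None
    for s, e in targets:
        if s <= end and start <= e:
            return 0  # read overlaps this target
        gap = s - end if s > end else start - e
        if best is None or gap < best:
            best = gap
    return best


def label_read(targets, start, end, flank=100):
    """Label a read as on_target, flanking, or off_target.

    flank is an assumed window size in bases, not a value from the text.
    """
    d = distance_to_nearest_target(targets, start, end)
    if d is None:
        return "off_target"
    if d == 0:
        return "on_target"
    return "flanking" if d <= flank else "off_target"
```

Counting the `flanking` reads separately shows how much of the off-target fraction still carries usable information about exon/intron boundaries.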