As traditional chain-terminator, or Sanger, DNA sequencing gives way to next-generation and next-next-generation approaches, reads are becoming available at higher throughput and lower cost. Applying these methods to cDNA libraries has catalyzed the emergence of RNA-Seq, a new tool that provides an unprecedented view of the eukaryotic transcriptome.
RNA-Seq is informative about multiple aspects of genome transcription, including DNA sequence, gene expression, and mRNA splicing, that no other single approach can interrogate together.
“One of the key things about RNA-Seq is that the information it provides cannot be obtained from genomic DNA sequencing,” says Joshua Z. Levin, Ph.D., research scientist and group leader in the genome sequencing and analysis program at the Broad Institute.
One of the challenges of whole-transcriptome sequencing is that transcripts are present at very different levels. While good coverage can be obtained for highly expressed transcripts, it is harder to achieve for low-abundance transcripts, which must be sequenced repeatedly to reach sufficient coverage and to ensure that an observed change is real and not merely a sequencing error.
“We need to have enough coverage, and we often need to see specific regions sequenced 10 to 20 times at a given position to be confident that the sequence is there, particularly when a genetic variant could exist in a heterozygous form,” says Dr. Levin.
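The link between coverage and confidence at a heterozygous site can be illustrated with a simple binomial calculation. The sketch below is only an illustration, not the method used by Dr. Levin's group: it assumes that both alleles are sampled with equal probability, that reads are independent, and that there are no sequencing errors. The function name and the threshold of three supporting reads are hypothetical choices.

```python
from math import comb


def prob_at_least(min_variant_reads: int, depth: int, allele_fraction: float = 0.5) -> float:
    """Probability that at least `min_variant_reads` of `depth` reads carry the variant.

    Assumes each read independently samples the variant allele with
    probability `allele_fraction` (0.5 for an idealized heterozygous site).
    """
    return sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(min_variant_reads, depth + 1)
    )


if __name__ == "__main__":
    # Hypothetical rule: require at least 3 reads supporting the variant.
    for depth in (5, 10, 20):
        print(depth, prob_at_least(3, depth))
```

Under these idealized assumptions, the chance of seeing the variant in at least three reads is 0.5 at 5-fold coverage and about 0.945 at 10-fold coverage, which is consistent with the 10-to-20-fold range quoted above. Unequal expression of the two alleles would lower these figures.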
In an approach known as targeted RNA-Seq, Dr. Levin and colleagues combined next-generation sequencing with the hybridization capture of relevant transcriptome subsets. In a study that applied oligonucleotide probes specific for 467 cancer-related genes to a tumor cDNA library, the authors showed that targeted RNA-Seq increased the coverage of low-abundance transcripts to levels at which sequence changes could be easily detected.
Targeted RNA-Seq provides information that cannot be obtained by any other single method. Among its greatest advantages are revealing alternative splicing, identifying novel fusion transcripts, quantitating transcript levels, and providing insights into allele-specific expression. When two different alleles co-exist, DNA sequencing alone cannot determine which one is expressed and in what proportion, but this information may be very relevant clinically.
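To make the idea of allele-specific expression concrete, the following sketch estimates the proportion of reads that come from each allele at a heterozygous position and asks whether the split departs from 50:50. It is a minimal illustration under stated assumptions, not the analysis used in the study: the read counts are invented, the function names are hypothetical, and the exact two-sided binomial test assumes independent, unbiased reads.

```python
from math import comb


def allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the alternate allele."""
    return alt_reads / (ref_reads + alt_reads)


def binomial_two_sided_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value against equal expression of both alleles."""
    n = ref_reads + alt_reads
    pmf = [comb(n, i) * 0.5**n for i in range(n + 1)]
    observed = pmf[alt_reads]
    # Sum every outcome that is no more likely than the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    # Invented counts for illustration only.
    ref, alt = 30, 10
    print(allele_fraction(ref, alt), binomial_two_sided_p(ref, alt))
```

A small p-value suggests that one allele is preferentially expressed, although in practice mapping bias and library preparation artifacts would also need to be ruled out.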
An area that has been profoundly impacted by targeted resequencing is anthropology. Svante Pääbo, Ph.D., director in the department of genetics at the Max Planck Institute for Evolutionary Anthropology, and colleagues recently revealed, in studies that examined bones from the Vindija cave in Croatia, that Neanderthals, the closest evolutionary relatives of present-day humans, share more genetic variants with European and Asian populations than with individuals from sub-Saharan Africa.
One of the major challenges that investigators faced was the extensive contamination of the DNA of interest with microbial DNA. “The high proportion of microbial DNA made shotgun sequencing impractical,” says Dr. Pääbo. By using targeted resequencing on ~49,000-year-old Neanderthal bone remains obtained from the El Sidrón cave in Spain, Dr. Pääbo and colleagues were able to recover more than one megabase of target DNA regions.
“The targeted approach and the capture approach are really important in ancient DNA research, not only because they allow the capture of a certain part of the Neanderthal genome but also because the Neanderthal DNA represents approximately 0.2 percent of the total DNA we have,” explains Dr. Pääbo.
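A back-of-the-envelope calculation shows why such a low endogenous fraction makes shotgun sequencing impractical for a defined target. The sketch below assumes that reads are drawn uniformly at random from all DNA in the sample; the 0.2 percent fraction and the one-megabase target come from the text, while the genome size of roughly 3 billion bases and the function name are assumptions added for illustration.

```python
def shotgun_reads_needed(
    on_target_reads: int,
    endogenous_fraction: float,
    target_bp: int,
    genome_bp: int,
) -> float:
    """Expected number of shotgun reads needed to obtain `on_target_reads` by chance.

    Assumes uniform random sampling: a read is endogenous with probability
    `endogenous_fraction`, and an endogenous read falls in the target with
    probability `target_bp / genome_bp`.
    """
    on_target_fraction = endogenous_fraction * (target_bp / genome_bp)
    return on_target_reads / on_target_fraction


if __name__ == "__main__":
    # 0.2 percent endogenous DNA, 1 Mb target, assumed ~3 Gb genome.
    print(shotgun_reads_needed(1, 0.002, 1_000_000, 3_000_000_000))
```

Under these assumptions, roughly 1.5 million shotgun reads would be needed, on average, for every read that lands in the target, which is the gap that hybridization capture is meant to close.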
The investigators interrogated approximately 14,000 protein-coding positions and identified 88 amino acid substitutions that have become fixed in the human population since its divergence from Neanderthals, illustrating the strength of targeted resequencing in shedding light on human protein evolution and in obtaining information from heavily contaminated ancient DNA samples.
By relying on several experimental approaches, including array-based and solution-based capture, Dr. Pääbo and colleagues are working on a survey of all the single-copy parts of chromosome 21. “We are seriously working on scaling up these techniques to eventually try to capture the whole single-copy component of the Neanderthal genome.”
While PCR has long been used to explore specific chromosomal regions, targeted resequencing not only provides insight into the chromosome at a much larger scale but also comes with additional advantages. For example, PCR amplification, which uses two primers, is very sensitive to mutations that could exist in the target at the sites where the primers anneal. “When things are captured by hybridization, the whole target is being used. Substitutions in the target have, therefore, a much smaller impact, and cause fewer problems than old-fashioned PCR that we used to perform,” explains Dr. Pääbo.
Targeted resequencing provides a less biased view of the genomic regions of interest and is less sensitive to polymorphisms or other genetic variants.
There is no question that targeted resequencing is having an impact. It is important to appreciate the advantages that each individual technique offers, the multiple layers of information that can be interrogated by combining different approaches, and the fascinating biological questions that find answers and fuel even more questions, illustrating the dynamic nature of scientific inquiry.