In 2005, two landmark papers describing novel cycle-array sequencing methods ushered in a new era in genetics. Known at the time as next-generation sequencing (NGS), these methods are now more commonly described as second-generation sequencing.
Although initially expected to supplant Sanger sequencing, NGS technologies have instead done an end run around it, encroaching on technologies such as the DNA microarray and staking a claim in new fields such as metagenomics.
Compared to conventional Sanger sequencing, second-generation sequencing has several advantages. One is a streamlined workflow that eliminates transformation and colony picking—major bottlenecks in the process. Another is mind-bogglingly massive parallelism. Array-based sequencing can theoretically capture hundreds of millions of sequences in parallel. These changes have dramatically reduced the cost of sequencing from about $0.50 per kilobase to as little as $0.001 per kilobase. For read length and sheer accuracy, however, Sanger sequencing still rules.
With 30 years of technology development behind it, a Sanger system can produce read lengths of 1,000 bp, with nearly perfect accuracy, whereas second-generation challengers tend to achieve read lengths of less than 100 bp, with roughly ten times the number of inaccurate base calls.
Second-generation sequencing technologies are still very much on the steep portion of the development curve, so improvements in accuracy and read length can be expected on a regular basis for years to come. In the meantime, however, second-generation sequencing has been embraced for myriad applications in which expensive Sanger sequencing would be out of the question.
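To put the per-kilobase prices quoted above in perspective, the short Python sketch below works out the implied fold reduction and the cost of a single project under each technology. The two prices come from the figures above; the one-megabase project size and the function name `sequencing_cost` are illustrative assumptions, not numbers from any of the studies discussed.

```python
# Per-kilobase prices as quoted in the article (US dollars).
SANGER_COST_PER_KB = 0.50
SECOND_GEN_COST_PER_KB = 0.001


def sequencing_cost(kilobases, cost_per_kb):
    """Return the raw sequencing cost for a given amount of sequence."""
    return kilobases * cost_per_kb


# Ratio of the two quoted prices: roughly a 500-fold reduction.
fold_reduction = SANGER_COST_PER_KB / SECOND_GEN_COST_PER_KB

# Hypothetical project size (an assumption for illustration): 1 Mb = 1,000 kb.
project_kb = 1_000

print(f"Fold reduction in cost per kb: about {fold_reduction:.0f}x")
print(f"Sanger:            ${sequencing_cost(project_kb, SANGER_COST_PER_KB):,.2f}")
print(f"Second generation: ${sequencing_cost(project_kb, SECOND_GEN_COST_PER_KB):,.2f}")
```

The comparison covers raw per-kilobase cost only; it ignores the shorter reads and higher error rates of second-generation platforms, which in practice mean more sequence has to be generated to reach the same confidence.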
At CHI’s “Exploring Next Generation Sequencing” meeting, to be held later this month, a number of speakers are slated to present real-world next-generation sequencing results, a sign of how far the technology has come since it was pioneered just four years ago.
Fishing for RNA Editing Sites
Representing the Church Laboratory at Harvard Medical School, Jin Billy Li, Ph.D., will present recent studies using the Illumina sequencing platform to do targeted sequencing of RNA editing sites. Targeted sequencing is an increasingly popular way to further reduce the costs associated with sequencing. Rather than sequencing a whole genome or a whole library, researchers select specific regions based on the nature of the study.
In this case, Dr. Li and his colleagues are studying adenosine-to-inosine RNA editing (inosine reads as guanosine), which diversifies the human transcriptome and has been implicated in brain function. Tissue from the cerebellum, frontal lobe, corpus callosum, diencephalon, small intestine, kidney, and adrenal gland from a single individual contributed the cDNA and gDNA for the study.
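Because inosine reads as guanosine, an edited site shows up as an A in genomic DNA but as a mixture of A and G in cDNA from the same individual. The Python sketch below illustrates that comparison for a single candidate site. It is a minimal illustration of the general idea, not the Church Laboratory's actual pipeline: the function name `editing_level` and both thresholds (`min_depth`, `max_gdna_g_fraction`) are hypothetical choices made for this example.

```python
from collections import Counter


def editing_level(cdna_bases, gdna_bases, min_depth=10, max_gdna_g_fraction=0.05):
    """Estimate the A-to-I editing level at one candidate site.

    cdna_bases / gdna_bases: the base each read reports at the site,
    e.g. "AAAGAG". Returns the fraction of cDNA reads showing G, or
    None when the site cannot be called. Thresholds are illustrative
    assumptions, not values from the study.
    """
    cdna = Counter(cdna_bases)
    gdna = Counter(gdna_bases)

    cdna_depth = cdna["A"] + cdna["G"]   # only A and G are informative here
    gdna_depth = sum(gdna.values())

    # Too few reads in either library: no call.
    if cdna_depth < min_depth or gdna_depth < min_depth:
        return None

    # G already present in genomic DNA suggests a genomic variant (SNP),
    # not RNA editing, so the site is rejected.
    if gdna["G"] / gdna_depth > max_gdna_g_fraction:
        return None

    return cdna["G"] / cdna_depth


# Site where genomic DNA is all A, but a quarter of the cDNA reads show G.
print(editing_level("A" * 30 + "G" * 10, "A" * 40))

# Site where the genome itself is A/G heterozygous: rejected.
print(editing_level("A" * 20 + "G" * 20, "A" * 20 + "G" * 20))
```

Sequencing gDNA and cDNA from the same individual is what makes the second check possible: a G that appears in both libraries points to a genomic difference rather than an editing event.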
By screening more than 36,000 computationally predicted A-to-I sites and obtaining 57.5 million reads, the team identified 500 new editing sites. “We have moved the RNA editing field forward a lot,” Dr. Li notes. “We’re moving from 20 or 30 sites to hundreds of new editing sites. We’re very excited with that piece of work.”
Previous efforts at identifying RNA editing sites have focused on conserved regions of the genome. Dr. Li’s study differs in that it used an unbiased approach, including conserved and unconserved locations enriched with RNA editing sites. The group’s results confirm previous observations that the primate lineage is enriched with RNA editing sites. Access to second-generation sequencing technology allowed these scientists to cast an extremely wide net, picking up 50 times as many instances of RNA editing as had ever been found before.