Recent reports in Science illustrate the range of applications enabled by sequencing technologies pioneered by Roche (www.roche.com) and 454 Life Sciences (www.454.com), now a Roche company. In one study (Science 2007; 317:1927-1930), researchers performed whole genome shotgun sequencing of mitochondrial DNA extracted from the ancient Siberian mammoth to examine population diversity.
Each single-stranded DNA template was captured on a bead and amplified within an individual emulsion droplet. The beads were collected on a PicoTiterPlate™ at a density of 1.9 million beads per plate and sequenced on Roche’s Genome Sequencer (GS) FLX instrument.
In another report (Science DOI: 10.1126/science.1149504), researchers used the GS FLX to detect structural variations in the human genome caused by additional, lost, or rearranged chunks of DNA. This type of genetic analysis, combined with SNP data, will help researchers paint a broader picture of the newly discovered variability present in the human genome.
Structural variation appears to be more prevalent than previously thought, and examples of evolutionary conservation, identified through comparisons with early human genomes, will contribute to an understanding of its role.
Tim Harkins, marketing manager for the 454 Sequencer at Roche Applied Science, says that these papers, together with a report linking gene expression with maternal behavior in the wasp (Science DOI: 10.1126/science.1149504), “demonstrate the flexibility of the platform” and the diversity of applications enabled by high-throughput sequencing, as each paper targets a different biological question.
“There is quite a bit of hype in the market regarding instrument specifications for next-generation sequencing and how much data you can get from an instrument run—total data versus usable data,” observes Harkins.
During parallel sequencing some reads will go awry and generate spurious data, which the Roche system filters out using a variety of quality control metrics, according to Harkins. Users should compare instrument systems based on usable data figures and understand “coverage models,” Harkins emphasizes. “Coverage model refers to how efficiently a sequencing read can be used either to detect a genetic variant or assembly of a genome—is one read needed or 100 reads?”
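The distinction between total and usable data can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the 80% usable fraction, and the 4-million-base genome are hypothetical values chosen for the example, not figures from Roche or from Harkins.

```python
def usable_bases(total_bases: int, usable_fraction: float) -> float:
    """Bases remaining after quality filtering.

    usable_fraction is a hypothetical pass rate between 0 and 1.
    """
    if not 0.0 <= usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be between 0 and 1")
    return total_bases * usable_fraction


def fold_coverage(bases: float, genome_size: int) -> float:
    """Average number of times each genome position is sequenced."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return bases / genome_size


if __name__ == "__main__":
    total = 100_000_000   # bases reported for one run (assumed)
    genome = 4_000_000    # hypothetical genome size in bases
    usable = usable_bases(total, 0.8)  # assumed 80% pass quality filters
    print(f"coverage from total data:  {fold_coverage(total, genome):.1f}x")
    print(f"coverage from usable data: {fold_coverage(usable, genome):.1f}x")
```

Under these assumed numbers, the same run supports a noticeably lower depth of coverage once filtered reads are excluded, which is why comparisons based on total output alone can mislead.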
The first-generation 454 system was able to sequence 20 Mb/run. Improved versions of the sequencer now output 100 Mb/run in less than eight hours, Harkins reports, and by mid-2008 the company expects to introduce upgrades that would push that figure to 1 Gb/run in less than 24 hours using the same instrument system and improved reagents.
Initial read lengths were approximately 100 bp and presently average about 250 bp, and published read lengths are expected to surpass 400 bp by mid-2008, he adds.
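As a rough check on how these figures fit together, dividing output per run by average read length gives the implied number of reads per run. The sketch below assumes that "Mb" and "Gb" mean 10^6 and 10^9 bases; pairing the projected 1 Gb/run with 400 bp reads is also an assumption, since the article gives both only as mid-2008 targets. The function name is hypothetical.

```python
def implied_reads_per_run(bases_per_run: int, mean_read_length_bp: int) -> int:
    """Approximate read count implied by run output and mean read length."""
    if mean_read_length_bp <= 0:
        raise ValueError("mean_read_length_bp must be positive")
    return bases_per_run // mean_read_length_bp


# (label, bases per run, mean read length in bp), taken from the figures quoted above
generations = [
    ("first generation", 20_000_000, 100),
    ("current", 100_000_000, 250),
    ("projected mid-2008 (assumed pairing)", 1_000_000_000, 400),
]

for label, bases, read_len in generations:
    reads = implied_reads_per_run(bases, read_len)
    print(f"{label}: about {reads:,} reads per run")
```

By this estimate the implied read count rises from about 200,000 to about 400,000 per run, and would reach roughly 2.5 million if the projected output and read length arrive together.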
The first iterations of next-generation sequencing platforms now available from companies such as 454 Life Sciences, Illumina (www.illumina.com), and Applied Biosystems (ABI; www.appliedbiosystems.com) represent the tip of the iceberg in terms of potential for future improvements in sequencing throughput and cost.
In addition to stepwise improvements that will accompany the evolution of individual technologies, there will be “disruptive spurts of improvement in either cost or throughput, or both,” in the view of Susan H. Hardin, Ph.D., president and CEO of VisiGen Biotechnologies (www.visigenbio.com).