GEN UPDATES in biotechnology: Sequencing
Quest for Technologies to Cut DNA Sequencing Costs
Carolyn Riley Chapman, Ph.D.
Upon completion of the human genome sequence in 2003, a vision for the future of genomics research was put forth by the U.S. National Human Genome Research Institute (NHGRI) and published in Nature. In it, the NHGRI called for researchers to develop technology that would allow sequencing of a human genome for $1,000. This is a tall order. Many different groups, however, both commercial and academic, have taken on this challenge. With one race barely finished, a new one is already under way.
In the fall of 2004, the NHGRI awarded grants to spur the development of lower-cost DNA sequencing technologies. Microchip Biotechnologies, Agencourt Bioscience, 454 Life Sciences, Li-Cor, and Stephen R. Quake, Ph.D., of Stanford University were among the awardees developing near-term methods to sequence a human-sized genome for $100,000. According to the NHGRI, "there is strong potential that five years from now, some of these technologies will be at or near commercial availability."
Microchip Biotechnologies’ approach is unique among awardees in that it plans to extend current Sanger sequencing methods. "Big capillary-based DNA sequencing analyzers are nearing the end of their technological life cycle," asserts Roger McIntosh, vice president of engineering. "The next major move is to replace big, clumsy capillary sequence analyzers with a chip-based system, which solves several problems simultaneously. What we’re doing is an evolutionary progression of the same chemistry. The key innovation is in reducing volumes and sizes and in automation." Compared with efforts based on new technologies, "the risk is much reduced."
"To get to a $100,000 genome you have to eliminate all typical consumables," says Stevan Jovanovich, Ph.D., president and CEO. "Even if you use only one pipette tip per read, that adds up. Another major cost is labor, so the system has to be totally automated. This matches up nicely with the microfluidic approach."
In Microchip’s technology, individual DNA fragments are attached to a bead surface and amplified. Sample preparation, separation, and detection will be completely integrated and automated, taking place on a microfluidic chip (a six-inch round glass wafer with etched channels). Many small components and technology elements still need to be developed and tied together. Some are being worked on by the Mathies lab at the University of California. "Another element is to develop new gels or separation matrices, suitable for use in tiny channels. People have used gels in capillary machines, but they’re not entirely appropriate for use on chips. This is being addressed by the Barron lab at Northwestern," says McIntosh. "When people think of automated equipment, they think of robotic arms. But now when we talk about it at this level, actuating tiny elements on a chip under computer control, the inherent reliability goes up. The technological leap is comparable to going from vacuum tubes to transistors in electronics. We’re going from big, large-scale robots to automated systems on a chip," says McIntosh.
Sequencing by Synthesis
Agencourt Bioscience claims it has one of the largest commercial sequencing facilities in the world. Gina L. Costa, Ph.D., director of new technology development, says that the company’s extensive DNA sequencing infrastructure and existing customer base provide advantages for advancing internal technology development. Like other awardees, Agencourt is taking a sequencing-by-synthesis approach, in which measurement takes place as nucleotides are incorporated into DNA. The firm’s technology is based on polony sequencing (polony is short for polymerase-generated colony), originally developed in the laboratory of George Church, Ph.D., at Harvard University. Agencourt has optimized methods for bead-based polony sequencing. In this system, clonal populations of short DNA fragments are amplified onto beads, then packed tightly onto a slide containing a gel matrix. In a flow cell, reactants flow over the slide to allow DNA synthesis to take place. "Hundreds of millions of DNA beads can be sequenced in each run," says Dr. Costa. "Our technology speeds up conventional sequencing by 100-fold and will be the equivalent of 500 to 1,000 ABI 3730xl instruments a day."
Agencourt’s methodology, like other non-single-molecule sequencing-by-synthesis approaches, affords read lengths of only 50 to 150 bases. This makes sequence assembly more challenging. To get around this, Dr. Costa says the company uses tricks to get paired-end information across short reads, which allows researchers to put short segments of nucleotides together like a jigsaw puzzle. "What really sets us apart is our simple and robust biology," Dr. Costa says. "Our system is not complicated. Simplicity and consistency are important attributes."
454 Life Sciences is "refining and advancing the performance of sequencing by synthesis," according to Marcel Margulies, vice president of engineering. He says 454’s technology is extremely high-throughput, both in terms of sequencing reactions and front-end sample preparation. "We perform sequencing by synthesis on solid support and on 400,000, 500,000, and even 700,000 clonally amplified fragments simultaneously," he says. "Entire floors of robots" can be devoted to sample preparation for conventional sequencing. "With our approach, irrespective of the size of the genome, you can do library preparation with one person, with a single sample preparation, in a couple hours," says Margulies.
DNA is amplified and bound to beads, which are then deposited into a fiberoptic plate with 1.6 million wells. Because of size exclusion and a 2.5 times oversupply of wells, only one bead gets deposited into each well. Nucleotides are then added in sequence. When a nucleotide is incorporated, inorganic pyrophosphate is released. "That release is observed through emission of light that is captured. Measurement takes place at each incorporation. The beauty is that if you have a homopolymeric stretch, all will incorporate at the same time, and you get five times as much light," explains Margulies.
454 generates a tremendous amount of data because of the number of reads performed simultaneously, according to Margulies. To address this, the company has outfitted computers with FPGA (field programmable gate array) chips. "The entire computer takes advantage of this to do data processing in real time. The way we designed the system, at the end of the run, we have reads that have been corrected and quality scored." 454 has also developed its own assembly software. "Typically people convert signal to nucleotide letters and then use the letters to do all the manipulations. We generate that as well, so one could use that information in conventional assemblers or mappers. But this may not ultimately be the best way of doing things. We have found that if you use signals themselves to find homologies as opposed to the letters, you can do a better job," Margulies explains.
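To make the signal-to-letters step concrete, here is a minimal sketch of how per-flow light intensities might be turned into bases. It is an illustration only, not 454's software: the function name `call_bases`, the "TACG" flow order, the normalization to a one-base signal, and the sample values are all assumptions made for this example.

```python
def call_bases(flow_order, signals, unit=1.0):
    """Convert per-flow light signals into a base sequence.

    flow_order: nucleotide added in each flow (hypothetical order).
    signals:    measured light for each flow, same length as flow_order.
    unit:       assumed light produced by a single incorporation.
    """
    bases = []
    for nucleotide, signal in zip(flow_order, signals):
        # Light is taken to be proportional to the number of bases
        # incorporated in the flow, so a homopolymer of five bases
        # gives about five times the single-base signal.
        run_length = max(0, round(signal / unit))
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical signals for two rounds of a T, A, C, G flow cycle.
flow_order = "TACG" * 2
signals = [1.02, 0.05, 0.0, 4.9, 0.1, 2.1, 0.97, 0.0]
print(call_bases(flow_order, signals))  # TGGGGGAAC
```

Rounding discards how close each signal was to a whole number, which is one way to see why working with the signals themselves, as Margulies describes, can retain information that the letters lose.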
Single molecule sequencing by synthesis. Many DNA or RNA strands (templates) are immobilized in a random fashion on a substrate. Primers containing a Cy3 fluorescent label are hybridized to the templates. The substrate is illuminated with a green laser, and the location of primer/template duplex is detected and recorded. The substrate is incubated with a nucleotide (e.g., G) labeled with fluorescent Cy5 and a DNA polymerase. Free nucleotides and polymerase molecules are washed out, and the incorporation of the labeled nucleotide is detected upon red laser illumination. Red spots (e.g., incorporated G nucleotides) are correlated with green spots (primer/template duplexes) to determine the sequence of the templates. This process is repeated with every nucleotide for as many cycles as required to achieve the desired read-length.
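The cycle described in this caption can be sketched in a few lines of code. This is a simplified model under stated assumptions, not the actual instrument logic: the function names, the G, A, T, C cycling order, and the rule that at most one labeled nucleotide is incorporated per cycle are all hypothetical choices made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_cycles(template, cycle_order="GATC", n_cycles=8):
    """Simulate one primer/template spot over repeated nucleotide cycles.

    template: template bases in the order the polymerase encounters them.
    Returns a list of (nucleotide, incorporated) pairs, one per cycle,
    where incorporated=True stands for a red spot at this location.
    """
    position = 0
    events = []
    for i in range(n_cycles):
        nucleotide = cycle_order[i % len(cycle_order)]
        # Assumption: a labeled nucleotide is incorporated only if it
        # pairs with the next template base, and only once per cycle.
        incorporated = (
            position < len(template)
            and COMPLEMENT[template[position]] == nucleotide
        )
        if incorporated:
            position += 1
        events.append((nucleotide, incorporated))
    return events


def read_from_events(events):
    """Rebuild the synthesized strand from the cycles that lit up."""
    return "".join(nucleotide for nucleotide, hit in events if hit)


events = simulate_cycles("CTAGG")
print(read_from_events(events))  # GATCC
```

The read is the strand complementary to the template, built up one detected incorporation at a time.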
Infrared Fluorescence
Li-Cor says that it pioneered the development of infrared fluorescence labeling and detection systems for DNA sequencing. The company is now pursuing single molecule DNA sequencing using charge switch dNTPs. "When a nucleotide is incorporated into DNA, pyrophosphate is cleaved from the nucleotide," says John Williams, Ph.D., principal scientist. "While dyes are typically attached to the base moiety, in our case the label is on the gamma phosphate so that the released pyrophosphate is labeled."
One of the greatest challenges in developing single molecule sequencing is in imaging. "With a single dye molecule, the signal-to-noise ratio is low. To get around error, you have to oversequence. This drives the cost back up," says Dr. Williams. One way Li-Cor is working to get around this problem is by sequencing "using brighter labels giving a stronger signal," explains Dr. Williams.
Li-Cor’s methodology, while still under development, holds advantages over other approaches, because "there is no limit to read length except the length of the template," adds Dr. Williams. Because electrophoresis cannot resolve size differences of only one nucleotide in longer DNA fragments, read lengths for conventional sequencing are currently limited to about 1,000 base pairs. Non-single-molecule sequencing-by-synthesis technologies have inherent limitations to read length as well. Since sequencing reactions are not 100% efficient, the population of lengthening DNA strands eventually gets too far out of phase for accurate sequence determination.
Long read lengths are important for two reasons. For one, they make sequence assembly of the genome easier, since there are fewer pieces to the puzzle. Another advantage is that you get haplotype information. "The single molecule approach lets you sequence each chromosome. You are able to see both chromosomes and how they differ from each other. If you had 100 base pair reads, you wouldn’t know which polymorphisms go with which chromosomes," adds Dr. Williams.
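The phasing limit on non-single-molecule methods mentioned above can be illustrated with a rough calculation. The sketch below assumes a deliberately simple model that is not taken from any of the companies: every strand in a clonal population extends with the same probability in each cycle, strands that fall behind never catch up, and a read stops being usable once fewer than half the strands remain in phase. The function names, the efficiencies, and the 50% threshold are all hypothetical.

```python
import math


def in_phase_fraction(efficiency, cycles):
    """Fraction of strands that extended correctly in every cycle so far."""
    return efficiency ** cycles


def max_read_length(efficiency, min_in_phase=0.5):
    """Largest cycle count that keeps the in-phase fraction at or above
    min_in_phase (an assumed usability threshold); requires 0 < efficiency < 1."""
    return math.floor(math.log(min_in_phase) / math.log(efficiency))


for efficiency in (0.98, 0.99, 0.995):
    cycles = max_read_length(efficiency)
    fraction = in_phase_fraction(efficiency, cycles)
    print(f"efficiency {efficiency}: {cycles} cycles, {fraction:.3f} in phase")
```

Under these assumptions, even a per-cycle efficiency of 99% keeps a majority of strands in phase for fewer than 100 cycles, which is in line with the short read lengths quoted for these methods. A single molecule has no population to fall out of step with, which is the basis of the read-length advantage Dr. Williams describes.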
Practical Sequencing Platform
Stephen R. Quake, Ph.D., professor of bioengineering at Stanford University, was another grant recipient focused on single molecule sequencing by synthesis. Helicos BioSciences, founded in May 2003, is working to create "a practical sequencing platform from my group’s technology," Dr. Quake says. Stan Lapidus, president and CEO of Helicos, believes that "moving away from the Sanger sequencing technology breaks the price and throughput bottleneck" for whole genome sequencing. The Quake/Helicos technology relies on the detection of fluorescence resonance energy transfer on a total internal reflection microscope. "On a single 10-cm proprietary substrate, we sequence billions of single strands of DNA in a single experiment," says Lapidus. The hope is to be able to sequence a whole genome in days using the methodology; eventually it may take only hours. In a sense, the images taken at each cycle become cross sections of the ultimate DNA strand, in that the full sequence is realized by stacking all the pictures on top of each other. "Our advantage is that we can do high densities. We expect to routinely reach a density of one million molecules per square millimeter and ultimately reach 10 million molecules per square millimeter." The company plans to have a demonstration project by the end of 2005.
It’s not yet clear when or if any of these technologies will allow consumers to get a human genome sequenced for $100,000. And maybe more than one group will need to succeed to really get prices to drop. "Competition and market pressure are going to push the price down," Dr. Jovanovich says. What is clear is that many groups are hard at work, and if DNA sequencing does become more affordable, modern biological research and the practice of human medicine will continue to undergo rapid-fire change.
Miniaturized and massively parallel, the 454 Life Sciences Genome Sequencing System works at picoliter levels to produce 20 megabases in a 4.5-hour run.
This article previously appeared in the March 1, 2005, issue of GEN.
Web: www.genengnews.com.

