Tracking Bacterial Transcriptomes
Bias is an important consideration in many types of genetic studies. Hybridization-based techniques suffer from bias because of differences in hybridization efficiencies of oligonucleotides. When Nicholas Bergman, Ph.D., assistant professor in the school of biology at Georgia Institute of Technology, set out to study bacterial gene expression, he wanted to look at the bacterial transcriptome in an unbiased way. That led him in the direction of second-generation sequencing, rather than the conventional DNA microarray.
Hundreds of bacterial genomes have been sequenced by now, and more genomes are under way. The large body of data, however, does not reveal much information about how the genes are expressed, how they are regulated, or how they are linked.
Dr. Bergman used second-generation sequencing technologies—Solexa from Illumina and SOLiD from Applied Biosystems, a division of Life Technologies—to probe the functions and relationships of the Bacillus anthracis transcriptome.
“We don’t know where those operon boundaries lie,” he says. “It’s surprisingly difficult to predict; we would like to understand that.” The result was a high-resolution map of the gene families within the bacterial genome. “Right off the bat, we could see transcript structure.”
As in the case of Dr. Li’s work with RNA editing, second-generation sequencing technology enabled Dr. Bergman and his colleagues to make a significant leap forward in the field of bacterial transcriptomics. Previously, the best-mapped bacterial transcriptome was that of E. coli, with 1.5% of transcripts mapped. In a single experiment, the Georgia Institute of Technology group captured 60% of transcripts from B. anthracis.
Dr. Bergman used a standard microarray approach to validate the experiment, with closely corresponding results. “For the most part they matched extremely closely, and they differ really only in instances where the arrays are not able to make an accurate measurement.”
Although it is a major accomplishment to be able to map the majority of transcripts in a bacterial genome, another significant aspect of this research is what it means for the DNA microarray. “The place I can speak in the most informed way is gene-expression analysis by sequencing,” Dr. Bergman says. “On that topic, I don’t see a long-term future for microarrays. The only thing holding us back is cost. Costs have been coming down steadily.”
Probing Mutations in Cancer
At the Broad Institute, researchers are looking for genetic causes of glioblastoma. Stacey Gabriel, Ph.D., director of the genome sequence and analysis program, will be presenting the results of their search using next-generation sequencing methods. The goal is to identify different kinds of genetic changes that may cause the cancer, such as point mutations, structural rearrangements, and other events in the genome.
In the past, a variety of methods, including microarrays and conventional sequencing, would have been used to answer these questions. Second-generation sequencing offers the benefit of addressing many questions in a single experiment. “There are things we can find with sequencing that we wouldn’t have found before,” Dr. Gabriel says, adding, “we’ve found rearrangements in the glioblastomas, for example, that we weren’t able to see on the SNP arrays. There are also interesting transformations of some genes that were new to us. In ovarian cancer we’re finding, using a combination of point mutations and structural rearrangements, new pathways important in the cancer.”
One of the key strategies of their approach is targeting only the coding portion of the genome, which is just 1–2% of the overall genome. They utilize Solexa, using a method invented at the Broad Institute, and work in collaboration with Agilent Technologies. “We call it hybrid selection. It’s a technology that uses long oligos that Agilent synthesizes on arrays. The long oligos are cleaved off the arrays and capture corresponding parts of the human genome. We’re able to isolate the part that gets captured and sequence that.”
The most powerful of these oligo sets is one that covers all human exons, from all 20,000 human genes. That allows the scientists to target every gene in one experiment. “The power of using next-generation sequencing and the richness of the information that we’re getting now, in an almost routine way, is impressive,” adds Dr. Gabriel. “There’s still a long way to go to really work out the best and most sensitive and specific analysis. I think that some of the errors that are created in this data are still poorly understood. One of the main challenges right now is increasing the accuracy of our interpretation of data.”
Matthew Ferber, Ph.D., is a codirector of the clinical molecular genetics laboratory at Mayo Clinic. His group is working on hereditary colon cancer. They start with 22 colon cancer genes on a Roche NimbleGen (Roche Applied Science) capture array, then elute the DNA and submit it for second-generation sequencing on the 454 and Solexa platforms.
The goal of this initial experiment was to compare the two platforms to see which is more useful for diagnostic purposes. “Each company right now has its strengths and weaknesses,” Dr. Ferber explains. So far, the comparison seems pretty much a wash, and Dr. Ferber has not yet made a final decision.
“At the end of the day, the answer no clinical lab wants to hear is, you’ll have to sequence with both platforms to get a clear clinical picture. The economics of this for a clinical laboratory becomes untenable. There must be a convergence of the technology that allows for high throughput, short run time, long reads, all at an economical price.”
Second-generation sequencing has developed rapidly since its introduction in 2005. Now appearing in research, industrial, and clinical laboratories, it is finding new uses and new applications, and in some cases supplanting older technologies.