June 1, 2009 (Vol. 29, No. 11)
Richard A. A. Stein M.D., Ph.D.
As the Features of Individual Cells Are Identified, Their Functioning as a Group Is Illuminated
Approximately 5×10³⁰ bacteria are estimated to inhabit our planet, and microorganisms colonizing our bodies outnumber our own cells by a factor of 10. Nevertheless, we know relatively little about our unicellular neighbors. A widespread approach used to explore microbial genomes is to culture the organism and analyze its genetic material; however, it is estimated that 99% of microorganisms fail to grow in culture.
Sequence analysis of the 16S ribosomal RNA gene provides another powerful tool for the molecular identification of thousands of bacteria from complex samples. Fragmenting and sequencing total DNA extracted from an environmental sample, known as shotgun sequencing, reveals information on the many organisms present, but assembling individual discrete genomes is tremendously difficult, if possible at all.
A more recent approach involves sequencing DNA derived from one bacterial cell at a time. While these sequences derive from a single genome, amplifying the few femtograms (10⁻¹⁵ grams) of DNA in a cell to obtain sufficient material for sequencing presents considerable challenges.
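To give a sense of scale for the "few femtograms" figure, the arithmetic can be sketched as below. This is an illustration under stated assumptions, not a calculation from the researchers quoted here: the 5-million-base-pair genome size and the 1-microgram sequencing target are hypothetical round numbers, and 650 g/mol is the commonly used approximate average mass of one double-stranded base pair.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one DNA base pair (assumption)


def genome_mass_fg(genome_size_bp: float) -> float:
    """Approximate mass, in femtograms, of one copy of a double-stranded genome."""
    grams = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e15  # 1 g = 1e15 fg


def fold_amplification(genome_size_bp: float, target_mass_ug: float) -> float:
    """Fold amplification needed to turn one genome copy into target_mass_ug of DNA."""
    target_fg = target_mass_ug * 1e9  # 1 microgram = 1e9 fg
    return target_fg / genome_mass_fg(genome_size_bp)


if __name__ == "__main__":
    size_bp = 5e6  # hypothetical bacterial genome of 5 million base pairs
    print(f"Genome mass: {genome_mass_fg(size_bp):.1f} fg")
    print(f"Fold amplification for 1 ug: {fold_amplification(size_bp, 1.0):.2e}")
```

Under these assumptions a single genome copy weighs roughly 5 fg, so reaching a microgram of material would require amplification on the order of a hundred million fold.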
“A key technology, multiple displacement amplification (MDA), developed almost ten years ago, was a huge advance compared to what had been done before,” says Roger Lasken, Ph.D., professor at the J. Craig Venter Institute. “MDA was a breakthrough in whole-genome amplification and, for the first time, we were able to amplify major portions of the genome from a single cell.”
MDA relies on bacteriophage Φ29 DNA polymerase, an enzyme with strong strand-displacement activity and high processivity. This approach uses random primers to synthesize amplicons with an average length of 12 kb and exhibits much less amplification bias than older PCR-based whole-genome amplification methods.
“I think MDA is going to be an important method for discovering uncultured bacteria. It is going to allow us to obtain bacterial cells from many different environments and also from human clinical specimens, and sequence their genomes without needing to develop culture methods,” continues Dr. Lasken.
Together with collaborators, Dr. Lasken tested DNA sequencing of Borrelia burgdorferi, the causative agent of Lyme disease, on single cells captured from the tick midgut by micromanipulation with a glass capillary. Alternative methods such as shotgun sequencing or PCR analysis from whole tissue would not have shown whether the sequences came from the same cell or from different ones.
Single-cell genomics will benefit not just unculturable microorganisms, but the ones that form colonies in culture as well.
Although bacterial colonies are derived from single cells, culturing may select fast-growing variants or counter-select certain virulence factors, and such genotypic changes might not accurately reflect the in vivo behavior of the microorganism. For example, some bacteria such as Neisseria gonorrhoeae or Helicobacter pylori lose certain virulence factors during in vitro culturing. “This is the reason, even in the cases when you can culture the pathogen, to sequence from single cells without culturing in order to investigate genotypes relating to the disease,” adds Dr. Lasken.
An essential aspect of genomics is the choice of DNA polymerases. To efficiently amplify sequences from heterogeneous microbial mixtures, an ideal enzyme should combine several characteristics such as high processivity and strand displacement activity, thermostability, and increased amplification fidelity. Thermophilic phage polymerases appear to exhibit several of these qualities and provide promising tools for single-cell applications.
“The promise of single-cell genomics will not be realized until the right DNA polymerase is developed that eliminates bias at the single molecule level, does not suffer from AT amplification bias, is compatible with heat lysis of cells, does not produce artifacts such as strand-switching chimeras or primer dimers, and can generate fewer branched molecules and more dsDNA,” says David Mead, Ph.D., CEO of Lucigen.
Recently, Thomas Schoenfeld, vp of enzyme discovery at Lucigen, examined viral populations inhabiting two hot springs, Bear Paw and Octopus, in Yellowstone National Park. From the more than 200 polymerase sequences identified in the two viral metagenomes that Schoenfeld and his collaborators generated, two classes of phage enzymes emerged, with their members very divergent from one another and from other classes of polymerases. Of these, PyroPhage 3173 became the most extensively studied enzyme, and represents the first thermostable phage DNA polymerase.
A unique combination of characteristics, which includes the ability to amplify GC-rich templates, strong proofreading and superior replication fidelity, reverse transcription, and remarkable strand-displacement activity, makes PyroPhage 3173 a promising tool for single-cell genomics.
Filling in the Gap
“There is still a gap between the microorganisms present in the environment and the ones that we can cultivate,” points out Martin Keller, Ph.D., director of the Oak Ridge National Laboratory’s biosciences division. “How can we fill this gap and use cutting-edge sequencing technologies to help us understand what specific microorganisms are doing in the environment? This is where single-cell microbiology comes into play.”
Recently, Dr. Keller and collaborators combined fluorescence in situ hybridization and flow cytometry with whole-genome amplification and sequencing. They illustrated the possibility of obtaining a significant fraction of an uncultured bacterial genome starting with several cells selectively isolated from a heterogeneous environmental sample. For a representative of the TM7 phylum that is less than 2% abundant in the soil, the investigators achieved approximately 20% genome coverage.
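A short calculation suggests why selective isolation matters for a group this rare. The sketch below is an illustration, not the investigators' method: it assumes cells are picked at random and independently, treats 2% as the abundance (the article gives this only as an upper bound), and uses hypothetical function names.

```python
import math


def prob_at_least_one(abundance: float, cells_picked: int) -> float:
    """Probability that random picking captures at least one cell of the target group."""
    return 1.0 - (1.0 - abundance) ** cells_picked


def cells_needed(abundance: float, confidence: float) -> int:
    """Smallest number of random picks giving at least `confidence` chance of one target cell."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - abundance))


if __name__ == "__main__":
    abundance = 0.02  # assumed abundance, the upper bound quoted in the text
    print(f"Chance of a hit in 50 random picks: {prob_at_least_one(abundance, 50):.2f}")
    print(f"Random picks needed for a 95% chance: {cells_needed(abundance, 0.95)}")
```

Under these assumptions, well over a hundred randomly chosen cells would be needed to be reasonably sure of capturing even one target cell, which is the kind of inefficiency that labeling and sorting the target group beforehand is meant to avoid.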
Next Level of Biofuels
An important effort in Dr. Keller’s group explores the plant-microbe interface, focusing on specific groups of microorganisms associated with the degradation of cellulosic material and carbon sequestration. Photosynthesis uses solar energy to generate plant cellulosic material, and cellulose decomposition by biomass-degrading microorganisms generates sugars that are subsequently fermented into alcohol.
“If we understand the connection between microbes and plants, we can really make a significant impact and help solve some of our major issues in carbon and energy,” adds Dr. Keller. “This is where I strongly believe that having these new tools—microbial ecology and single-cell genomics—and all the genomics tools together, can help us understand a lot about carbon fixation and about how we can go to the next level of biofuels.”
The world’s oceans harbor some of the most diverse microbial populations. Ramunas Stepanauskas, Ph.D., senior scientist at the Bigelow Laboratory for Ocean Sciences, and collaborators recently combined fluorescence-activated cell sorting with whole-genome amplification on coastal water samples to isolate and sequence two uncultured flavobacteria from the Gulf of Maine, with estimated genome recoveries of 91% and 78%.
“Recent large-scale metagenomic sequencing unveiled the enormous richness of genes encoded in environmental microbial assemblages,” explains Dr. Stepanauskas. “However, it is very difficult, if not impossible, to assemble discrete genomes and to understand entire biochemical pathways through metagenomics alone.”
The study illustrated the power of these technologies in obtaining reference genomes, and represents a major advance because cultivation-based studies have so far mostly recovered “weeds” with no ecological significance in the natural environments. The annotation of the two genomes unveiled several metabolic pathways with potential in bioenergy production, such as biopolymer degradation and proteorhodopsin photometabolism.
“I have tremendous expectations for single-cell genomics,” says Tanja Woyke, Ph.D., research scientist at the Lawrence Berkeley National Laboratory/DOE Joint Genome Institute and first author of the study. “Together with other technologies aimed at accessing uncultured microbes, I think it will become one of the dominant methods that will benefit microbial ecology as well as other fields such as bioprospecting.”
An intriguing question in microbiology revolves around bacteria that can adopt a dormant state and remain metabolically inactive for extended periods. One medically relevant example, Mycobacterium tuberculosis, can persist in a dormant state in tissues and regain virulence decades later by mechanisms incompletely elucidated.
Single-cell approaches may allow metabolically quiescent cells to be compared to their metabolically active counterparts and unveil genomic features that could explain the mechanisms of dormancy.
A large portion of marine microorganisms, just like those populating the soil or the human body, are dormant and refractory to classical culturing methods. It is still not understood how they survive in that state without being quickly eaten up by grazers such as protists.
Dr. Stepanauskas and his colleagues are using single-cell approaches to compare bacteria isolated from the food vacuoles of bacterivorous protists to metabolically active and dormant members of the bacterioplankton, to better understand how bacterial metabolic activity correlates with the probability of being grazed, a concept with important implications for the plankton genetic diversity.
“I think that single-cell genomics provides a huge opportunity to better understand what makes microbes dormant, what enables them to survive in a dormant state, and what triggers them to become active again. This is a basic question in microbiology, with implications anywhere from marine ecology to human diseases,” explains Dr. Stepanauskas.
Investigators at the Bigelow Laboratory for Ocean Sciences established a microbial single-cell genomics core facility that recently started processing the first samples and will offer increased throughput and reduced per-sample costs. At full capacity, which will be reached in 2010, the facility anticipates performing cell sorting, whole-genome amplification, and PCR-based screening on tens of thousands of individual cells weekly.
As the importance of a few selected cells is increasingly recognized as driving tumor development, metastatic dissemination, and resistance to therapy, single-cell genomics emerges as a promising tool to understand tumor heterogeneity.
“Obtaining the whole transcriptome from single cells is invaluable, because every cancer cell is different, every cell is at a different developmental stage, and to truly understand carcinogenesis you really need to do it at a single-cell level,” states Kaiqin Lao, Ph.D., principal scientist in the molecular cell biology division at Applied Biosystems, a division of Life Technologies.
In collaboration with a group from the University of Cambridge, Dr. Lao used Applied Biosystems’ SOLiD system to perform digital expression profiling of a single mouse blastomere and identified 75% more genes than by microarray methodologies.
This approach unveiled 1,753 previously unknown splice junctions and, for the first time, unambiguously confirmed that 8–19% of the genes with multiple known transcript isoforms expressed at least two isoforms within the same blastomere or oocyte. This finding will provide insight into biological processes relying on specific isoforms and facilitate their selective therapeutic manipulation.
“This is really intriguing and important,” states Dr. Lao. “With conventional transcriptome assays it was not possible to ascertain if the multiple transcript isoforms found in a tissue or organ coexist in the same cell, or if they are just expressed in different cell types or at different cell cycle stages of the same cell type. This knowledge is crucial to provide a real understanding of the transcriptome complexity within individual cells. Such single-cell technology would be very important to gain a greater understanding of the behavior cancer stem cells and solid tumors, such as breast cancer, exhibit.”
Measurements relying on genetic material extracted from entire organs or tissues do not reveal the heterogeneity of genomic information. “The ability to truly perform single-cell nucleic acid measurements, which retain the true, unbiased information coming from cells, is the gold mine in genome biology,” believes Patrice Milos, Ph.D., vp and CSO at Helicos BioSciences.
A recent innovative approach developed at Helicos relies on sequencing RNA from small numbers of cells by capturing the polyA tails of the cellular transcripts through hybridizing to an oligo-dT surface. The RNA-DNA hybrids formed as a consequence of this strong and stable interaction are then subjected to sequencing-by-synthesis by using Helicos’ True Single Molecule Sequencing (tSMS)™ technology. This approach does not require ligation or PCR amplification steps and thus offers an unbiased view of events at what will become single-cell nucleic acid levels, according to Dr. Milos.
Increasingly, new experimental tools become informative about discrete features that set individual cells apart and reveal their contribution to the population as a whole. In this context, Richard Feynman’s words, so eloquently illustrating the need to explore individual components that shape entire systems, become very relevant: “Nature uses only the longest threads to weave her patterns, so each small piece of her fabric reveals the organization of the entire tapestry.”