November 1, 2014 (Vol. 34, No. 19)
In 1977, when researchers published the genome of bacteriophage PhiX174, the first fully sequenced virus, about 1,000 base pairs could be sequenced within a year.
Researchers estimated that if they were to keep using the same sequencing technology, they would need a thousand years to complete the Escherichia coli genome, and a million years to complete the human genome.
Thanks to subsequent advances in sequencing technologies, the human genome—all 3 billion base pairs—was sequenced within 13 years. These advances opened the possibility of interrogating the link between genetic mutations and disease more precisely and systematically than ever before.
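The scale of those early estimates can be checked with back-of-envelope arithmetic. The sketch below uses the 1977 rate of about 1,000 base pairs per year and the roughly 3 billion base pairs of the human genome quoted above. The E. coli genome size of about 4.6 million base pairs is an assumption not stated in the text. The results agree with the researchers' round figures only to within an order of magnitude.

```python
# Back-of-envelope check of the 1977-era sequencing estimates.
RATE_BP_PER_YEAR = 1_000  # approximate rate quoted in the text

GENOME_SIZES_BP = {
    "Escherichia coli": 4_600_000,  # assumption: roughly 4.6 million base pairs
    "Human": 3_000_000_000,         # about 3 billion base pairs, as in the text
}


def years_to_sequence(genome_bp, rate=RATE_BP_PER_YEAR):
    """Years needed to read genome_bp base pairs at a constant yearly rate."""
    return genome_bp / rate


for name, size in GENOME_SIZES_BP.items():
    print(f"{name}: about {years_to_sequence(size):,.0f} years")
```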
“We are interested in genetics and genomics of multiple sclerosis as a general theme,” says Sergio E. Baranzini, Ph.D., professor of neurology at the University of California San Francisco School of Medicine. As part of the International Multiple Sclerosis Genetics Consortium, Dr. Baranzini and colleagues are focusing on analyzing and integrating datasets from patients with multiple sclerosis.
Research in Dr. Baranzini’s lab uses whole genome sequencing to build on the information that was generated by genome-wide association studies and involves the sequencing of families with high prevalence of the disease. “[This kind of sequencing] is followed by the sequencing of individuals with particular types of multiple sclerosis that are noteworthy by disease presentation, response to therapy, or other clinical criteria,” explains Dr. Baranzini.
These efforts are particularly important for unveiling new susceptibility loci, considering that even after finding that HLA and about 150 additional genetic loci are associated with multiple sclerosis, a large proportion of heritability is still not explained for this condition.
“We are facing two types of challenges that are interrelated, and these are cost and data analysis,” continues Dr. Baranzini. Even though sequencing has become much more cost-effective in recent years, it still entails the sometimes overlooked costs associated with data analysis, a process that may take several months.
Pathway Analysis
In a recent study, Dr. Baranzini and colleagues performed a pathway analysis on a large dataset generated by the International Multiple Sclerosis Genetics Consortium. The analysis was designed to identify groups of susceptibility variants rather than each susceptibility variant in isolation in every patient.
The need for this approach stems, in part, from the possibility that at the population level, different patient subsets harbor susceptibility variants in various genes from distinct pathways. The advantage of this approach is that it could enable researchers to stratify patients by pathway and characterize disease heterogeneity in the population, a defining feature of many medical conditions.
“It is likely that a given individual may harbor common as well as rare variants that are either pathogenic or at least have a biological effect,” notes Dr. Baranzini. “A pathway-based approach could help to identify pathways that are more affected by the burden of genetic variation.”
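The idea of grouping variants by pathway can be illustrated with a toy tally. This is a minimal sketch, not the consortium's method: the gene names, pathway names, and patient records are all hypothetical, and a real pathway analysis would add statistical testing and curated pathway databases. It only shows how counting variants per pathway, rather than per gene, lets patients be grouped by the pathway that carries the most variants.

```python
from collections import defaultdict

# Hypothetical gene-to-pathway map (illustrative names only).
GENE_TO_PATHWAYS = {
    "GENE_A": ["pathway_1"],
    "GENE_B": ["pathway_1", "pathway_2"],
    "GENE_C": ["pathway_2"],
    "GENE_D": ["pathway_3"],
}

# Hypothetical lists of genes carrying a susceptibility variant, per patient.
PATIENT_VARIANT_GENES = {
    "patient_1": ["GENE_A", "GENE_B"],
    "patient_2": ["GENE_C"],
    "patient_3": ["GENE_D"],
}


def pathway_burden(variant_genes, gene_to_pathways):
    """Count how many variant-carrying genes fall in each pathway."""
    burden = defaultdict(int)
    for gene in variant_genes:
        for pathway in gene_to_pathways.get(gene, []):
            burden[pathway] += 1
    return dict(burden)


def top_pathway(variant_genes, gene_to_pathways):
    """Pathway with the highest count; ties go to the alphabetically first name."""
    burden = pathway_burden(variant_genes, gene_to_pathways)
    if not burden:
        return None
    return max(sorted(burden), key=burden.get)


for patient, genes in PATIENT_VARIANT_GENES.items():
    print(patient, top_pathway(genes, GENE_TO_PATHWAYS))
```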
Dr. Baranzini’s lab, in keeping with its standing as a founding member of the Multiple Sclerosis Microbiome Consortium (together with investigative teams at Mount Sinai, the University of Colorado Boulder, and the California Institute of Technology), has undertaken an additional research effort. This investigation involves the sequencing of the microbiome in multiple sclerosis patients.
Recent studies linked changes in the human intestinal microbiome to various medical conditions including obesity, diabetes, atherosclerosis, and inflammatory bowel disease. At an international multiple sclerosis conference in early September, Dr. Baranzini presented preliminary data pointing to identifiable differences in the intestinal microbiome of multiple sclerosis patients as compared to healthy controls.
“We are initiating efforts to expand the study to much larger patient populations and use the microbiome as a window to the environment to complement what we know about the genetics and provide a more complete picture of multiple sclerosis,” declares Dr. Baranzini.
One of the fascinating problems in biology is understanding somatic chromosomal aneuploidy. A high frequency of aneuploidy has historically been reported in several organs, particularly the human brain and liver, even in the absence of histopathological changes. As opposed to constitutional aneuploidy, in which entire chromosomes are missing from or added to all the cells of an organism, somatic aneuploidy affects only selected cells within an organ.
While it was initially thought that somatic aneuploidy could shed light on interindividual variability, on differences in the physiology of tissues and organs, and possibly on pathological processes affecting them, understanding the functional relevance of this sort of anomaly has proved challenging.
Single-Cell Sequencing
For most of the last few decades, visualizing chromosomal aneuploidies has relied on cytological techniques, such as fluorescence in situ hybridization (FISH), in which the genome is probed for the presence of certain chromosomal regions.
“This method has certain limitations,” points out Angelika Amon, Ph.D., professor of biology at MIT. In the case of FISH, determining the number of chromosomes relies on counting specks under the microscope, an error-prone strategy.
“We decided to use single-cell sequencing to address the question of somatic aneuploidy in a very unbiased way,” says Dr. Amon. “We were able to gain an unprecedented insight into genomic heterogeneities within an organism.”
Dr. Amon and colleagues provided the first genome-wide assessment of chromosomal copy number changes at single-cell resolution. Probing for entire chromosomes and using tissues lacking any evidence of disease, this technique revealed that the brain and the liver have about 5% aneuploid cells, a prevalence that is lower than previously reported, and comparable to the level of aneuploidy that can be seen in skin cells.
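As an illustration of how whole-chromosome copy number might be read from single-cell sequencing data, the sketch below compares each chromosome's share of reads with the share expected in a normal two-copy cell. It is a simplified assumption, not Dr. Amon's pipeline: the three-chromosome genome, expected read fractions, and read counts are hypothetical, and real analyses must also correct for amplification and coverage biases.

```python
from statistics import median

# Expected share of reads per chromosome in a normal two-copy cell.
# Hypothetical values for a toy three-chromosome genome.
EXPECTED_FRACTION = {"chrA": 0.5, "chrB": 0.3, "chrC": 0.2}


def estimate_copy_numbers(read_counts, expected=EXPECTED_FRACTION, ploidy=2):
    """Estimate whole-chromosome copy number from per-chromosome read counts."""
    total = sum(read_counts.values())
    # Observed share of reads divided by the expected share.
    ratios = {c: (read_counts[c] / total) / expected[c] for c in expected}
    # Normalize by the median ratio so that one gained or lost chromosome
    # does not shift the baseline for all the others.
    baseline = median(ratios.values())
    return {c: round(ploidy * r / baseline) for c, r in ratios.items()}


def is_aneuploid(copy_numbers, ploidy=2):
    """A cell is called aneuploid if any chromosome deviates from the ploidy."""
    return any(cn != ploidy for cn in copy_numbers.values())


# Hypothetical read counts for three single cells.
cells = {
    "cell_1": {"chrA": 500, "chrB": 300, "chrC": 200},  # balanced
    "cell_2": {"chrA": 500, "chrB": 300, "chrC": 300},  # extra chrC reads
    "cell_3": {"chrA": 500, "chrB": 150, "chrC": 200},  # fewer chrB reads
}

calls = {name: estimate_copy_numbers(counts) for name, counts in cells.items()}
aneuploid_fraction = sum(is_aneuploid(c) for c in calls.values()) / len(calls)
print(calls)
print(f"Aneuploid fraction: {aneuploid_fraction:.2f}")
```

Because every chromosome is assessed from the same read data, this kind of count-based estimate covers the whole genome at once, whereas a probe-based method such as FISH examines only the regions it is designed to detect.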
“This is only a starting point,” notes Dr. Amon. “We would like to use single-cell sequencing to begin asking questions about cancer, aging, and neurodegenerative diseases to see whether and how genome composition is altered in these disease conditions.”
“We have used shotgun metagenomic sequencing to explore, in an unbiased manner, microbial communities from wound samples,” says Jonathan E. Allen, Ph.D., bioinformatics scientist at the Lawrence Livermore National Laboratory.
Dr. Allen and colleagues are using a metagenome sequencing and bioinformatics approach to analyze microbial populations from combat wound infections. This approach contrasts with the “gold standard” methods for exploring the microbial organisms that colonize wound infections, which typically rely on growing the bacteria on plates and in liquid cultures.
In a recent analysis of extremity injuries in U.S. combat personnel, Dr. Allen and colleagues found a broad diversity of microbial populations in the wounds of different individuals, and also showed that standard microbial culturing techniques are not able to capture the full diversity of these populations.
“The real potential for advancement is to see all kinds of new microbes that are present in the wound but may not show up in culture-based diagnostics,” emphasizes Dr. Allen.
This strategy, in addition to unveiling unculturable microbes, also provides faster analysis and enables earlier diagnosis, which is often crucial. Current limitations in identifying microbes from wound infections, using this approach, are mostly related to sample preparation and the time required for sequencing and subsequent analyses. “That time is changing rapidly, but it is certainly shorter than the time it takes to culture microorganisms,” notes Dr. Allen.
Significant Challenge
While a major goal is to visualize many more microbes that are present in a sample, a significant challenge is capturing microorganisms that are present at low abundance but are still important for the clinical outcome.
“One of the goals is to be able to type these low-abundance organisms at the highest resolution and characterize them at the strain level, as opposed to placing them at the broad level of a family, and for that, the goal is to be able to get as much information as possible about what that isolate is,” explains Dr. Allen.
Intimately linked to this concept, a major effort by Dr. Allen and colleagues involves the identification of low levels of mutants that may be selected for in response to a change in the environment. These mutations may be rare at an early point in time, but they may become more prevalent as the environment changes.
“One of our functional hypotheses is that it is possible to see a mutation at a low level and ascribe some functional fitness to it, and then observe that low-level mutation emerge at a later time as a dominant variant in response to some change in the environment,” affirms Dr. Allen.
His team is developing a framework to monitor different environments and characterize microbial strains that may be circulating in the background, prior to the occurrence of events of concern.
“Tracking low-level variants early on is important, because we can observe the things that start out at low abundance and may be increasing in frequency over time,” Dr. Allen insists. “This allows us to measure the rate of increase in the population.”
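One way to put a number on such a rate of increase is to fit a straight line to the log-odds of the variant's frequency over time. The sketch below is a minimal illustration under a simple assumption, namely that the variant grows logistically, and it is not a description of Dr. Allen's framework. The sampling times and frequencies are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a frequency strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def growth_rate(times, freqs):
    """Least-squares slope of logit(frequency) against time.

    Under the assumed logistic-growth model, the slope approximates the
    variant's relative rate of increase per unit of time.
    """
    if len(times) != len(freqs) or len(times) < 2:
        raise ValueError("need at least two matched time points")
    ys = [logit(p) for p in freqs]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    return sxy / sxx


# Hypothetical sampling days and observed variant frequencies.
days = [0, 7, 14, 21]
observed = [0.01, 0.03, 0.08, 0.20]
print(f"Estimated rate: {growth_rate(days, observed):.3f} per day")
```

A positive slope indicates a variant that is gaining ground, and comparing slopes across variants gives a rough ranking of which low-abundance strains deserve closer attention.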
When it comes to biological diversity, humans have a biased perspective based on charismatic macroscopic organisms, according to Steven Hallam, Ph.D., associate professor of microbiology and immunology at the University of British Columbia. Despite this skewed perspective, microscopic bacteria and archaea represent the majority of life on Earth.
“[Microbial] diversity,” says Dr. Hallam, “is cryptic because most microorganisms remain to be cultivated.” Studies performed decades ago in hot springs found that compared to microbial organisms that can be grown in the lab, there were many more groups of microorganisms initially described as candidate divisions.
“These were known to exist on the basis of a single gene sequence mapped onto the tree of life,” he continues. Over time, more and more microbial diversity fell within these candidate divisions, and they became known as microbial dark matter because their metabolic potential was completely unknown.
“The majority of living things on the planet have eluded our consciousness for so long,” marvels Dr. Hallam. “They continue to resist clonal isolation.”
To explore the structure and function of microbial communities in natural and engineered ecosystems, Dr. Hallam and colleagues are using plurality approaches, which rely on sequencing DNA or RNA directly from microbial samples, and (more recently) single-cell genomic approaches, which allow individual genomic sequences to be generated after sorting single cells.
“If we envision the genomes from a microbial community as a collection of puzzles, we can use the plurality approach to generate a gigantic pile of puzzle pieces, and let the sequencer and downstream computational workflow sort everything out,” explains Dr. Hallam.
This strategy almost never recovers the complete sequence of any individual genome, but it allows incomplete genomic sequences of individual microbial genomes to be assembled. “The single-cell approach is powerful because we know that the sequences come from a specific puzzle,” asserts Dr. Hallam. “Combining that with the plurality approach allows us to effectively build composite or population genomes for uncultivated microorganisms.”
In a recent study harkening back to the early hot spring observations, Dr. Hallam and colleagues performed a survey of the microbial diversity in the stratified waters of meromictic Sakinaw Lake, and identified an unprecedented number of microbial dark matter groups. Network analysis combining this diversity information with single-cell genomic sequences identified potential metabolic interactions between microbial dark matter and methane-producing microbes.
In a separate study focused on metabolic pathway reconstruction, Dr. Hallam’s group provided a computational workflow for predicting potential metabolic interactions between microorganisms. Such interactions expand the metabolic repertoire of individual cells by creating emergent pathways in which multiple microbial groups cooperate to complete a biochemical process.
“We are now poised to infer metabolic pathways within individual cells, populations, and whole microbial communities within an integrated software environment,” explains Dr. Hallam. This creates real opportunities to learn about how microbial communities interact and cooperate to solve metabolic problems.
“One of the fascinating things that we are starting to realize is that the genomes of many microbial dark matter groups assembled using plurality and single-cell genomic methods are tuned for metabolic interactions,” he continues. “They appear to have nutrient and energy dependencies that mandate a cooperative mode of existence, one that might explain their resistance to clonal isolation.”
Looking forward, it will be interesting to see how a new understanding of microbial interactions will inform the development of biotechnologies based on these cooperative design principles.