Doing without reference genomes when analyzing metagenomic samples may not be as daring as it sounds. After all, reference genomes typically represent just a fraction of the species and viruses that populate the gut and other microbial environments. Moreover, even identified species show considerable genetic heterogeneity. So, if relying on reference genomes would leave us throwing up our hands, how else might we sift through DNA sequence data and identify which fragments belong to individual organisms and viruses, including ones not previously identified?
According to researchers affiliated with the Metagenomics of the Human Intestinal Tract (MetaHIT) project, a promising new approach to analyzing DNA sequence data involves something they call the co-abundance principle. Basically, this principle assumes that different pieces of DNA from the same organism will occur at the same abundance within any given sample, and that this abundance will vary across a series of samples. Pieces of DNA whose abundances rise and fall together from sample to sample can therefore be attributed to the same organism.
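The co-abundance idea can be illustrated with a toy sketch. The code below is not the researchers' actual algorithm; it is a minimal greedy grouping that assumes Pearson correlation as the similarity measure and an arbitrary threshold of 0.9. The gene names, abundance values, and the function names `pearson` and `bin_co_abundant` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / (norm_x * norm_y)


def bin_co_abundant(profiles, threshold=0.9):
    """Greedily group genes whose profiles correlate with a group's seed gene.

    profiles: dict mapping gene name -> list of abundances, one per sample.
    threshold: assumed cutoff for illustration, not a value from the study.
    """
    groups = []  # each group is a list of gene names; the first is the seed
    for gene, profile in profiles.items():
        for members in groups:
            seed_profile = profiles[members[0]]
            if pearson(profile, seed_profile) >= threshold:
                members.append(gene)
                break
        else:
            groups.append([gene])  # no matching group, so start a new one
    return groups


# Hypothetical abundances of five genes across five samples.
profiles = {
    "gene_a1": [10, 50, 5, 80, 20],
    "gene_a2": [11, 48, 6, 82, 19],
    "gene_b1": [70, 5, 60, 10, 40],
    "gene_b2": [68, 6, 63, 9, 42],
    "gene_a3": [9, 52, 4, 79, 21],
}

groups = bin_co_abundant(profiles)
for members in groups:
    print(members)
```

In this made-up data, the three `gene_a` profiles track one another across samples and so fall into one group, while the two `gene_b` profiles form another. Real data would involve millions of genes and hundreds of samples, and would call for a far more careful clustering strategy.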
The researchers described their approach in an article published online June 6 in Nature Biotechnology (“Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes”). They report how using the co-abundance principle let them map 500 previously unknown microorganisms in human intestinal flora as well as 800 previously unknown bacteriophages.
“Using our method, researchers are now able to identify and collect genomes from previously unknown microorganisms in even highly complex microbial societies. This provides us with an overview we have not enjoyed previously,” said Professor Søren Brunak, a researcher at the Technical University of Denmark (DTU). Prof. Brunak co-headed the study together with DTU associate professor Henrik Bjørn Nielsen.
“Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses, and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences,” the study’s authors wrote. “We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS).”
The researchers report that they have assembled 238 high-quality microbial genomes. In addition, they identified affiliations between MGS and hundreds of viruses or genetic entities. “Our study tells us which bacterial viruses attack which bacteria, something which has a noticeable effect on whether the attacked bacteria will survive in the intestinal system in the long term,” added Nielsen.
Previously, bacteria were studied individually in the laboratory, but researchers are becoming increasingly aware that understanding the intestinal flora requires looking at the interactions among the many different bacteria found there. Once these interactions are understood, it may be possible to develop more selective ways to treat a number of diseases.
“Ideally we will be able to add or remove specific bacteria in the intestinal system and in this way induce a healthier intestinal flora,” explained Søren Brunak.
The researchers' work could improve the understanding and treatment of a number of diseases, such as type 2 diabetes, asthma, and obesity. It is also relevant to the growing problem of antimicrobial resistance, which many consider a real threat to global health.
“We have previously been experimenting with using bacteria and viruses to fight disease, but this was shelved because antimicrobial agents have been so effective in combating many infectious diseases. If we can learn more about who attacks who, then bacterial viruses could be a viable alternative to antimicrobial agents. It is therefore extremely important that we now can identify and describe far more relations between bacteria and the viruses that attack them,” concluded Nielsen.