A collaborative study has developed software called gutSMASH that identifies primary metabolic gene clusters (MGCs) in specific microbial taxonomic groups that are important in host–microbiome interactions. Although built to predict MGCs from anaerobic human gut bacteria, the software can also be applied to identify metabolic pathways in microbial communities occupying other niches in the body.
The study was published in the journal Nature Biotechnology last week: “gutSMASH predicts specialized primary metabolic pathways from the human gut microbiota.” The Python-based software gutSMASH was built from the source code for an earlier algorithm called antiSMASH, which predicts biosynthetic gene clusters by detecting physically clustered protein domains using profile hidden Markov models and is freely downloadable.
The study was led by Michael Fischbach, PhD, an associate professor in the departments of bioengineering, and microbiology and immunology at Stanford University; Dylan Dodd, PhD, an assistant professor of pathology, and microbiology and immunology at Stanford University; and Marnix Medema, PhD, an assistant professor of computational biology at the University of Leiden in The Netherlands.

“We tailored this gene cluster detection framework to detect MGCs involved in primary metabolism and bioenergetics,” the authors noted. “gutSMASH can be a valuable tool in the field of enzyme/pathway discovery, to link metabolites to gene clusters and to identify genes responsible for microbiome-associated phenotypes.”
Members of our gut microbiome synthesize many small molecules or metabolites that alter a range of physiological attributes, from digestion to mood. Components of biosynthetic pathways that produce such metabolites are physically clustered in regions of microbial genomes intuitively called metabolic gene clusters. Algorithms that predict microbial metabolic pathways focus on MGCs for primary metabolism—pathways involved in growth and maintenance of cellular activity.
The researchers used the algorithm to systematically profile gut microbiome metabolism and identified 19,890 gene clusters in 4,240 high-quality microbial genomes. “We found marked differences in pathway distribution among phyla, reflecting distinct strategies for energy capture,” the authors noted.
The results explain how different bacterial families use unique enzymatic pathways to produce short-chain fatty acids and indicate that each microbial group (taxon) occupies a characteristic metabolic niche.
To determine the prevalence and abundance of each metabolic pathway in humans, the investigators analyzed 1,135 individuals from a Dutch cohort. Surprisingly, they found that the level of microbiome-derived metabolites in plasma and feces in these human samples did not reflect the abundance of the genes of the metabolic pathways that produce these metabolites in gut microbial metagenomes. This indicates a crucial role for pathway-specific gene regulation and metabolite flux.
Identification of bacterial pathways has come to depend on software that identifies physically clustered genes. Analyses based on conserved gene clusters prevent false-positive errors that result from analyses based on sequence similarity alone. The principle is widely applied in natural product biosynthesis.
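To make the clustering principle concrete, here is a minimal Python sketch of the general idea. It is not gutSMASH's or antiSMASH's actual code or detection rules. It assumes that domain hits on a single contig have already been found (for example, by a profile hidden Markov model search), and the `DomainHit` type, the `find_clusters` function, the `domain_A`/`domain_B`/`domain_C` names, and the 10,000 bp `max_gap` threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    gene: str    # gene identifier (hypothetical)
    domain: str  # protein domain the gene matched (hypothetical name)
    start: int   # gene start coordinate on the contig, in bp
    end: int     # gene end coordinate on the contig, in bp


def find_clusters(hits, required_domains, max_gap=10_000):
    """Group neighbouring hits on one contig and keep complete groups.

    Hits are chained into a group while each starts within max_gap bp
    of the end of the group so far. A group is reported only if it
    contains every domain in required_domains.
    """
    required = set(required_domains)
    groups, group, group_end = [], [], 0
    for hit in sorted(hits, key=lambda h: h.start):
        # Too far from the current group: close it and start a new one.
        if group and hit.start - group_end > max_gap:
            groups.append(group)
            group = []
        group.append(hit)
        group_end = hit.end if len(group) == 1 else max(group_end, hit.end)
    if group:
        groups.append(group)
    # Keep only groups in which all required domains co-occur.
    return [g for g in groups if required <= {h.domain for h in g}]


# Illustrative input: three adjacent genes plus one isolated homolog.
example_hits = [
    DomainHit("g1", "domain_A", 1_000, 2_000),
    DomainHit("g2", "domain_B", 2_500, 3_500),
    DomainHit("g3", "domain_C", 4_000, 5_000),
    DomainHit("g9", "domain_A", 500_000, 501_000),
]
clusters = find_clusters(example_hits, {"domain_A", "domain_B", "domain_C"})
print([[h.gene for h in c] for c in clusters])  # [['g1', 'g2', 'g3']]
```

In this toy input, the isolated gene `g9` matches `domain_A` on sequence alone but has no neighbouring partners, so it is not reported. A similarity-only search would have counted it as a hit, which is the kind of false positive that cluster-based analysis is meant to avoid.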
“This work is a starting point for understanding differences in how bacterial taxa contribute to the chemistry of the microbiome,” the authors noted.