Researchers participating in the Enzyme Function Initiative (EFI) have used a computationally rich metabolite-docking strategy to sift through a protein’s potential metabolic partners and predict its function. Beginning with more than 87,000 possible metabolic partners, the researchers narrowed their search to just a few structures that seemed promising fits. These few molecules were then subjected to a battery of laboratory tests, which ultimately determined the protein’s role in its host, the marine bacterium Pelagibaca bermudensis.
The protein, HpdD, was initially identified via genomic information, but its function remained unknown. The technique used to determine that function may prove useful in contexts well beyond bacteria. “The goal was not simply to identify the protein’s function but to forge a new way to tackle the vast and growing body of sequence data for which functional information is lacking,” said University of Illinois professor John Gerlt, one of five co-principal investigators of the EFI team’s study. “At present, the number of proteins in the protein-sequence database is approaching 42 million. But not more than 50% of these proteins have reliable functions assigned to them.”
Matthew Jacobson and postdoctoral researcher Suwen Zhao at the University of California, San Francisco, led the computational effort at the heart of the group’s streamlined approach to protein discovery. Their method pairs an enzyme with tens of thousands of possible metabolic partners to see which molecules fit the enzyme best.
This process identified four possible substrates. The identities of these four substrates, along with a likely pathway in which the enzyme operates, were passed along to other members of the EFI team, who began the painstaking laboratory work needed to identify the actual substrate.
The details of this work are described in the EFI team’s paper, “Discovery of new enzymes and metabolic pathways by using structure and genome context,” which appeared online in Nature on September 20. According to the paper, “The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway … was established by transcriptomics.”
The researchers discovered that their enzyme catalyzes the first step in a biochemical pathway that enables the marine bacterium to consume one of the substrates identified in Jacobson’s lab. The bacterium uses the substrate, known as tHypB, as a carbon source. (tHypB also helps the organism deal with the stress of life in a salty environment.)
This effort to understand the function of one enzyme offers a cascade of other benefits, Gerlt said. One big advantage of this approach is that it aids in the identification of orthologs (enzymes that perform the same task in other organisms).
“There are dozens of orthologs in the protein database that were identified, so we determined not only the function of one but we also determined the functions of all these enzymes,” Gerlt said. And because the researchers also identified the functions of all the enzymes in the pathway that allows the microbe to consume tHypB, their work offers insight into the role of orthologous enzymes in similar pathways in other organisms.
Researchers with the EFI are working to develop strategies and tools that other researchers can use to accomplish similar feats of discovery. “There was a time when a researcher devoted his or her entire career to a single enzyme,” Gerlt said. “That was a long time ago, although some people still practice that. Now, genome-sequencing technology has changed the way that biologists have to look at problems. We can’t keep looking at problems in isolation.”