A new technique, DeepBAR, quickly calculates the binding affinities between drug candidates and their targets. The approach, which combines chemistry and machine learning, could lower a hurdle in the drug discovery process. DeepBAR yields precise calculations in a fraction of the time required by previous state-of-the-art methods. The inventors say DeepBAR could one day quicken the pace of drug discovery and protein engineering.

“Our method is orders of magnitude faster than before, meaning we can have drug discovery that is both efficient and reliable,” said Bin Zhang, PhD, professor in chemistry at MIT, an associate member of the Broad Institute of MIT and Harvard, and co-author of a new paper describing the technique.

The research is published in the Journal of Physical Chemistry Letters in the article, “DeepBAR: A Fast and Exact Method for Binding Free Energy Computation.”

The affinity between a drug molecule and a target protein is measured by a quantity called the binding free energy: the smaller the number, the stickier the bind. “A lower binding free energy means the drug can better compete against other molecules,” said Zhang, “meaning it can more effectively disrupt the protein’s normal function.” Calculating the binding free energy of a drug candidate provides an indicator of a drug’s potential effectiveness. But existing methodologies struggle to balance accuracy and efficiency.
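To make "the smaller the number, the stickier the bind" concrete, the short sketch below converts a binding free energy into a dissociation constant using the standard thermodynamic relation ΔG° = RT ln(K_d / 1 M). The relation is textbook thermodynamics rather than something taken from the DeepBAR paper, and the function name and the two example energies are illustrative assumptions.

```python
import math

# Gas constant in kcal/(mol*K); binding free energies are often quoted in kcal/mol.
R_KCAL = 1.987204e-3


def dissociation_constant(delta_g_kcal_per_mol, temperature_k=298.15):
    """Return K_d in mol/L from a standard binding free energy.

    Uses delta_G = R * T * ln(K_d / 1 M). Hypothetical helper for
    illustration only; it is not part of DeepBAR.
    """
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


# Two made-up candidates: a weaker binder and a tighter binder.
kd_weak = dissociation_constant(-6.0)
kd_tight = dissociation_constant(-12.0)

print(f"-6 kcal/mol  -> K_d = {kd_weak:.2e} M")
print(f"-12 kcal/mol -> K_d = {kd_tight:.2e} M")
```

At room temperature the first candidate comes out at roughly tens of micromolar and the second at roughly a nanomolar, so halving the free energy from -6 to -12 kcal/mol tightens binding by about four orders of magnitude.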

Methods for computing binding free energy fall into two broad categories, each with its own drawbacks. One category calculates the quantity exactly, eating up significant time and computer resources. The second category is less computationally expensive, but it yields only an approximation of the binding free energy. Zhang, together with a postdoc in the lab, Xinqiang Ding, PhD, devised an approach to get the best of both worlds.

DeepBAR, short for deep generative models and the Bennett acceptance ratio method, computes binding free energy exactly, but it requires just a fraction of the calculations demanded by previous methods. The new technique combines traditional chemistry calculations with recent advances in machine learning.

The “BAR” in DeepBAR stands for “Bennett acceptance ratio,” a decades-old algorithm used in exact calculations of binding free energy. Using the Bennett acceptance ratio typically requires knowledge of two “endpoint” states (e.g., a drug molecule bound to a protein and a drug molecule completely dissociated from a protein), plus knowledge of many intermediate states (e.g., varying levels of partial binding), all of which bog down calculation speed.
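For readers who want to see what the Bennett acceptance ratio does, here is a minimal sketch on a toy problem, not the authors' code. It assumes two one-dimensional harmonic "states" with energies in units of kT, chosen because their free energy difference is known exactly, and solves the standard BAR self-consistency equation by bisection. All names and parameters are illustrative.

```python
import numpy as np


def fermi(x):
    """Numerically stable Fermi function 1 / (1 + exp(x))."""
    return np.exp(-np.logaddexp(0.0, x))


def bar_free_energy(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-10):
    """Estimate F1 - F0 (in kT) with the Bennett acceptance ratio.

    w_forward: U1(x) - U0(x) evaluated on samples drawn from state 0.
    w_reverse: U0(x) - U1(x) evaluated on samples drawn from state 1.
    """
    w_f = np.asarray(w_forward, dtype=float)
    w_r = np.asarray(w_reverse, dtype=float)
    m = np.log(len(w_f) / len(w_r))

    def g(df):
        # Monotonically increasing in df; its root is the BAR estimate.
        return fermi(m + w_f - df).sum() - fermi(-m + w_r + df).sum()

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Toy endpoints (assumed for illustration): U0 = k0*x^2/2 and U1 = k1*x^2/2.
k0, k1, n = 1.0, 4.0, 20000
rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0 / np.sqrt(k0), n)  # Boltzmann samples of state 0
x1 = rng.normal(0.0, 1.0 / np.sqrt(k1), n)  # Boltzmann samples of state 1

estimate = bar_free_energy(0.5 * (k1 - k0) * x0**2, 0.5 * (k0 - k1) * x1**2)
exact = 0.5 * np.log(k1 / k0)  # analytic F1 - F0 for two harmonic wells

print(f"BAR estimate: {estimate:.3f} kT, exact: {exact:.3f} kT")
```

The estimate is reliable here because the two toy states overlap substantially. When the endpoints share few configurations, as with a bound and a fully dissociated drug, the same estimator needs a chain of intermediate states to bridge them, which is the cost described above.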

The authors noted that, compared to the rigorous potential of mean force (PMF) approach that requires sampling from intermediate states, “DeepBAR is an order-of-magnitude more efficient as demonstrated in a series of host–guest systems.”

DeepBAR slashes those in-between states by deploying the Bennett acceptance ratio in machine-learning frameworks called deep generative models. “These models create a reference state for each endpoint, the bound state and the unbound state,” said Zhang. These two reference states are similar enough that the Bennett acceptance ratio can be used directly, without all the costly intermediate steps.
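The reference-state idea can be sketched on a toy problem. In the example below, a fitted Gaussian stands in for the deep generative model. That substitution is an assumption made for brevity, and the one-dimensional anharmonic "endpoints" are also invented for illustration. Because the Gaussian is a normalized density, its free energy is known to be zero, so a single BAR calculation between each endpoint and its own reference gives that endpoint's absolute free energy with no intermediate states.

```python
import numpy as np

rng = np.random.default_rng(1)


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    return np.exp(-np.logaddexp(0.0, x))


def bar(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-10):
    """Bennett acceptance ratio estimate of F1 - F0 in kT, via bisection."""
    m = np.log(len(w_forward) / len(w_reverse))
    g = lambda df: fermi(m + w_forward - df).sum() - fermi(-m + w_reverse + df).sum()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if g(mid) > 0 else (mid, hi)
    return 0.5 * (lo + hi)


def make_energy(k, c):
    """Toy endpoint energy U(x) = k*x^2/2 + c*x^4, in units of kT."""
    return lambda x: 0.5 * k * x**2 + c * x**4


def sample_endpoint(k, c, n):
    """Exact rejection sampling: propose from N(0, 1/k), accept with exp(-c*x^4)."""
    out = np.empty(0)
    while out.size < n:
        x = rng.normal(0.0, 1.0 / np.sqrt(k), size=2 * n)
        out = np.concatenate([out, x[rng.random(x.size) < np.exp(-c * x**4)]])
    return out[:n]


def absolute_free_energy(energy, samples):
    """Free energy of an endpoint relative to a learned, normalized reference."""
    # "Train" the stand-in generative model: a Gaussian fitted to the samples.
    mu, sigma = samples.mean(), samples.std()
    # Reference energy is -log q(x); q is normalized, so F_ref = 0 exactly.
    ref_energy = lambda x: 0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma * np.sqrt(2 * np.pi))
    ref_samples = rng.normal(mu, sigma, size=samples.size)
    w_forward = energy(ref_samples) - ref_energy(ref_samples)  # reference -> endpoint
    w_reverse = ref_energy(samples) - energy(samples)          # endpoint -> reference
    return bar(w_forward, w_reverse)


# Two made-up endpoints, loosely playing the roles of "unbound" (A) and "bound" (B).
params_a, params_b, n = (1.0, 0.1), (4.0, 0.5), 20000
f_a = absolute_free_energy(make_energy(*params_a), sample_endpoint(*params_a, n))
f_b = absolute_free_energy(make_energy(*params_b), sample_endpoint(*params_b, n))
delta_f_estimate = f_b - f_a

# Brute-force check by numerical integration, feasible only because this is 1D.
grid = np.linspace(-10.0, 10.0, 200001)
dx = grid[1] - grid[0]
log_z = lambda u: np.log(np.sum(np.exp(-u(grid))) * dx)
delta_f_exact = -(log_z(make_energy(*params_b)) - log_z(make_energy(*params_a)))

print(f"estimate: {delta_f_estimate:.3f} kT, exact: {delta_f_exact:.3f} kT")
```

The key design point is that each reference is built to resemble its own endpoint, so the overlap BAR needs is there from the start. Only the two endpoints are ever sampled. In a real molecular system a simple Gaussian would be far too crude, which is where a flexible deep generative model comes in.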

In using deep generative models, the researchers were borrowing from the field of computer vision. “It’s basically the same model that people use to do computer image synthesis,” said Zhang. “We’re sort of treating each molecular structure as an image, which the model can learn. So, this project is building on the effort of the machine learning community.”

While adapting a computer vision approach to chemistry was DeepBAR’s key innovation, the crossover also raised some challenges. “These models were originally developed for 2D images,” said Ding. “But here we have proteins and molecules—it’s really a 3D structure. So, adapting those methods in our case was the biggest technical challenge we had to overcome.”

In tests using small protein-like molecules, DeepBAR calculated binding free energy nearly 50 times faster than previous methods. Zhang said that efficiency means “we can really start to think about using this to do drug screening, in particular in the context of COVID-19. DeepBAR has the exact same accuracy as the gold standard, but it’s much faster.” The researchers added that, in addition to drug screening, DeepBAR could aid protein design and engineering, since the method could be used to model interactions between multiple proteins.

DeepBAR is “a really nice computational work” with a few hurdles to clear before it can be used in real-world drug discovery, said Michael Gilson, MD, PhD, professor of pharmaceutical sciences at the University of California, San Diego, who was not involved in the research. He said DeepBAR would need to be validated against complex experimental data. “That will certainly pose added challenges, and it may require adding in further approximations.”

In the future, the researchers plan to improve DeepBAR’s ability to run calculations for large proteins, a task made feasible by recent advances in computer science. “This research is an example of combining traditional computational chemistry methods, developed over decades, with the latest developments in machine learning,” said Ding. “So, we achieved something that would have been impossible before now.”