February 15, 2012 (Vol. 32, No. 4)

Combining Chemists’ Expertise and Software Capabilities to Explore Different Paths with StarDrop

One of the defining challenges of drug discovery is the need to make complex decisions regarding the design and selection of potential drug molecules based on a relative scarcity of experimental data. Synthesizing compounds and generating experimental data, even using modern high-throughput methods, is time-consuming and expensive.

Therefore, the opportunity to explore new compound ideas has, until recently, been limited, leading to a focus on the iterative exploration of a relatively small number of closely related compounds. One risk of this is that opportunities to identify high-quality compounds may be missed, as the tendency to quickly focus on a relatively small range of chemical diversity prevents a broad search of the chemical possibilities.

With the use of in silico predictive methods it is possible to consider a much larger range of ideas before choosing those on which to focus intellectual, synthetic, and experimental efforts.

In this new scenario, the limitation becomes the time and experience necessary to generate a wide diversity of compound ideas and manually enter these into a computer. However, the recent development of computational methods to automatically generate new, chemically relevant compound ideas can dramatically increase the range of ideas that can be considered.

One approach to generating new compound ideas is to apply medicinal chemistry “transformation rules” to one or more initial compounds to create related structures. A transformation rule is a structural modification that might typically be considered by a medicinal chemist in the optimization of a compound. Transformations do not necessarily correspond to specific synthetic routes or reactions, but represent relatively tractable steps in chemical space.

Transformations may be derived from the medicinal chemistry literature or medicinal chemists’ personal experiences and may include simple substitutions, functional group replacements, or larger changes such as modification, addition, or removal of ring systems.

This concept was originally introduced in the Drug Guru platform developed at Abbott and is also employed in Pareto Ligand Designer and Optibrium’s StarDrop™.

A suite of software for guiding decisions in drug discovery using predictive models and multiparameter optimization, StarDrop includes a plug-in module called Nova™ that specifically enables medicinal and computational chemists to automatically generate new compound ideas by applying transformations to their molecules in order to improve their properties.

The advantage of this approach is that, because the transformations are based on structural modifications with precedent in medicinal chemistry, the resulting structures are more likely to be relevant. There is little value in proposing compound structures that are unstable, infeasible, or chemically nonsensical. This was one of the reasons for the limited success of early approaches to computational generation of new compound structures.

In practice, it is difficult to reduce the proportion of unacceptable compounds generated below 5% without losing the generality of the transformations, which would reduce the range of ideas that can be explored. Furthermore, although a small number of poor structures may be a minor distraction, they may also stimulate ideas for similar compounds that are chemically feasible.

These transformations can be applied iteratively to generate multiple generations of compounds, as illustrated in Figure 1. This permits a very large number of chemically relevant new ideas to be easily generated, but leads to a new problem: how to assess these ideas to identify those most likely to be of interest to a drug discovery project?

Figure 1. Compound ideas can be generated by iteratively applying medicinal chemistry transformations to an initial input structure to create multiple generations of ideas.
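The iterative scheme in Figure 1 can be sketched in a few lines of Python. This is only a toy illustration, not how Nova or any other product is implemented: the "transformations" here are hypothetical text substitutions on SMILES strings (the helper replace_first and the two rules are invented for this example), whereas real transformation rules operate on molecular graphs and check chemical validity.

```python
from typing import Callable, Iterable, List

# A transformation maps one structure to zero or more new structures.
Transformation = Callable[[str], Iterable[str]]


def replace_first(old: str, new: str) -> Transformation:
    """Hypothetical toy rule: swap the first occurrence of one SMILES fragment."""
    def apply(smiles: str) -> Iterable[str]:
        if old in smiles:
            yield smiles.replace(old, new, 1)
    return apply


def generate_ideas(seeds: List[str],
                   transformations: List[Transformation],
                   generations: int) -> List[List[str]]:
    """Apply every rule to every compound, one generation at a time.

    Returns one list of new, previously unseen structures per generation.
    """
    seen = set(seeds)
    current = list(seeds)
    result = []
    for _ in range(generations):
        next_generation = []
        for smiles in current:
            for transform in transformations:
                for product in transform(smiles):
                    if product not in seen:  # skip duplicates across generations
                        seen.add(product)
                        next_generation.append(product)
        result.append(next_generation)
        current = next_generation
    return result


# Assumed example rules: fluorine to chlorine, and oxygen to nitrogen.
RULES = [replace_first("F", "Cl"), replace_first("O", "N")]
ideas = generate_ideas(["OCCF"], RULES, generations=2)
for number, generation in enumerate(ideas, start=1):
    print(f"Generation {number}: {generation}")
```

Even in this toy, the number of candidates depends on the number of rules and generations, and duplicate structures reached by different routes have to be filtered out, which is why the sketch tracks every structure it has already seen.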

What Makes a Good Idea?

A successful drug must possess a delicate balance of many properties in order to be efficacious and safe, including potency against its therapeutic target, appropriate physicochemical and absorption, distribution, metabolism, and elimination (ADME) properties, and a lack of off-target effects and nonspecific toxicity. The simultaneous optimization of many of these properties is commonly described as multiparameter optimization (MPO), and many approaches have been developed to facilitate this in drug discovery.

A predictive model of a single property can be used to prioritize compound ideas for one objective, while MPO can be used to simultaneously consider many criteria that a drug discovery project must optimize. However, the greatest benefit can be gained by coupling algorithms for idea generation with MPO, to generate new ideas that have the best balance of properties for a drug discovery project’s therapeutic objectives. This ensures that efforts are focused on those chemistries with the highest chance of downstream success.

Pareto Ligand Designer combines transformation-based idea generation with a Pareto optimization algorithm. Pareto optimization does not select compounds based on a single ideal profile, but explores a range of solutions, each with a different, optimal balance of properties (Figure 2).

This approach is most useful when the best combination of properties is not known a priori and therefore it is advantageous to explore a range of different property profiles.

Figure 2. The goal of Pareto optimization is a compound with both high solubility and potency (represented by the yellow star). While this ideal is not achievable, the red points (for example, the point labeled A) are Pareto optimal, i.e., for each of these compounds no other compound has both better potency and better solubility. Conversely, point B is not Pareto optimal because there is a point that has both higher potency and higher solubility.
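The idea of a Pareto optimal set can be made concrete with a short sketch. The compound names and property values below are invented for illustration, and the code uses the standard textbook definition of dominance (at least as good in every property and strictly better in at least one), which may differ in detail from what any particular tool implements.

```python
from typing import List, NamedTuple


class Compound(NamedTuple):
    name: str
    potency: float     # higher is better (assumed units, e.g. pIC50)
    solubility: float  # higher is better (assumed units, e.g. logS)


def dominates(a: Compound, b: Compound) -> bool:
    """True if a is at least as good as b in both properties and better in one."""
    at_least_as_good = a.potency >= b.potency and a.solubility >= b.solubility
    strictly_better = a.potency > b.potency or a.solubility > b.solubility
    return at_least_as_good and strictly_better


def pareto_front(compounds: List[Compound]) -> List[Compound]:
    """Keep the compounds that no other compound dominates."""
    return [c for c in compounds
            if not any(dominates(other, c) for other in compounds)]


# Hypothetical data: B is beaten by A on both potency and solubility.
compounds = [
    Compound("A", potency=8.0, solubility=2.0),
    Compound("B", potency=6.0, solubility=1.0),
    Compound("C", potency=9.0, solubility=0.5),
    Compound("D", potency=5.0, solubility=3.0),
]
front = pareto_front(compounds)
print([c.name for c in front])
```

In this example A, C, and D each represent a different trade-off between potency and solubility, so all three remain on the front, while B is discarded because A is better on both counts.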

An alternative approach, employed by StarDrop, is to prioritize compounds with the best balance of properties relative to an ideal profile, as defined by a drug discovery project team (an example is shown in Figure 3). Examples of such methods include the calculation of a score for each compound considering the desirability of its property values or a probabilistic scoring approach that prioritizes compounds with the highest chance of success against the required profile.

This conjures a utopian vision (at least for some!) of a computer that automatically generates and explores new compound ideas before selecting a small number for synthesis and testing, confident in the knowledge that these will yield a high-quality drug. This would dramatically reduce the time and effort required for drug discovery, and increase the chance of finding a high-quality candidate drug by exploring a very large diversity of possibilities.

Alas, the realization of such a vision remains remote in practice. This is because the accuracy of predictive methods is insufficient to identify a compound and say with confidence that it will achieve the requirements for a high-quality outcome. Predictive models are typically accurate only to within a factor of two to ten when predicting binding affinities or other biological properties. Therefore, while models can be very useful in discarding poor compounds, they can only identify those compounds most likely to meet the ideal criteria. A computer cannot yet design a perfect compound.
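To see why this level of uncertainty matters, consider a rough sketch of the chance that a compound truly meets a potency criterion given only a prediction. It rests on assumptions that are not stated in this article: that the prediction error is Gaussian on a log scale, and that a "factor of two" or "factor of ten" corresponds to one standard deviation of about 0.3 or 1.0 log units, respectively. The function name and the example values are hypothetical.

```python
import math


def probability_above(predicted: float, threshold: float, sigma: float) -> float:
    """Probability that the true value exceeds the threshold.

    Assumes the true value is normally distributed around the prediction
    with standard deviation sigma (all quantities in log units).
    """
    z = (predicted - threshold) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Assumed example: predicted pIC50 of 7.5 against a required pIC50 of 7.0.
factor_of_two = math.log10(2.0)    # about 0.3 log units
factor_of_ten = math.log10(10.0)   # 1.0 log unit

p_tight = probability_above(7.5, 7.0, factor_of_two)
p_loose = probability_above(7.5, 7.0, factor_of_ten)
print(f"Factor-of-two model: {p_tight:.2f}")
print(f"Factor-of-ten model: {p_loose:.2f}")
```

Under these assumptions, the same predicted value gives much less confidence when the model error is a factor of ten than when it is a factor of two, which is why predictions are better suited to ranking and filtering ideas than to guaranteeing success.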

Furthermore, other requirements of a good compound, such as synthetic tractability and novelty from an IP perspective, remain subjective factors that rely on the expertise and creativity of a medicinal chemist. Therefore, the ideal approach combines a chemist’s expertise with the capabilities of a computer. The computer can help to explore a wide range of possibilities and prioritize those most likely to be of interest for detailed consideration, while the expert will make the final decision on the strategy to be adopted.

Figure 3. An example of a scoring profile that defines the property criteria for an ideal compound and the importance of each individual criterion to the overall objective of a project. Because it is often impossible to find an ideal compound that meets all of the criteria, the importance values define the acceptable compromises. Underlying each of the criteria is a desirability function that can be used to make more subtle distinctions between property values. An example is shown for blood-brain barrier penetration (the log of the ratio between concentrations in brain and blood); the ideal range is between -0.2 and 1, and a compound with a value below this range is worse than a compound with too high a value.
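A scoring profile of this kind can be sketched as a set of desirability functions combined with importance weights. The example below is an illustration under stated assumptions, not StarDrop's scoring method: only the ideal range of -0.2 to 1 for blood-brain barrier penetration comes from Figure 3, while the desirability values outside that range (0.2 and 0.6), the potency criterion, the importance weights, and the use of a weighted geometric mean to combine criteria are all hypothetical choices.

```python
import math
from typing import Callable, Dict, Tuple

Desirability = Callable[[float], float]


def range_desirability(low: float, high: float,
                       below: float, above: float) -> Desirability:
    """Return 1.0 inside [low, high], `below` under it and `above` over it."""
    def desirability(value: float) -> float:
        if value < low:
            return below
        if value > high:
            return above
        return 1.0
    return desirability


def profile_score(properties: Dict[str, float],
                  profile: Dict[str, Tuple[Desirability, float]]) -> float:
    """Combine desirabilities as a geometric mean weighted by importance.

    Assumes at least one criterion has a positive importance weight.
    """
    total_weight = sum(weight for _, weight in profile.values())
    log_sum = 0.0
    for name, (desirability, weight) in profile.items():
        d = desirability(properties[name])
        if d <= 0.0:
            return 0.0  # a completely unacceptable value vetoes the compound
        log_sum += weight * math.log(d)
    return math.exp(log_sum / total_weight)


# Hypothetical profile: values below the ideal logBB range are penalized
# more heavily (0.2) than values above it (0.6), as in the Figure 3 example.
PROFILE = {
    "logBB": (range_desirability(-0.2, 1.0, below=0.2, above=0.6), 1.0),
    "pIC50": (range_desirability(7.0, math.inf, below=0.3, above=1.0), 0.5),
}

ideal = profile_score({"logBB": 0.3, "pIC50": 8.0}, PROFILE)
too_high = profile_score({"logBB": 1.5, "pIC50": 8.0}, PROFILE)
too_low = profile_score({"logBB": -0.5, "pIC50": 8.0}, PROFILE)
print(ideal, too_high, too_low)
```

With these assumed values, a compound inside the ideal range scores highest, one with too high a value scores lower, and one with too low a value scores lowest, mirroring the asymmetry described in the caption. A smoother desirability function could replace the step function where finer distinctions are needed.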

Matthew Segall is CEO at Optibrium. Web: www.optibrium.com.
