Building a Focused Library

Better Design Is One Way to Increase the Chances of a Successful Screening Hit

The better your library is designed, the better your chances of a successful screening hit. Given knowledge of the protein target, a focused approach to library design leads to far higher hit rates than high-throughput or random screening.

The forgeV10 computational suite from Cresset BioMolecular Discovery uses field-based models to quantify the biological activity of molecules. This knowledge enables users to build a focused screening library with novel and diverse chemical structures while keeping the range of activity focused to give the maximum chance of success against a defined target.

Small molecule drugs are recognized by and bind to proteins on the basis of their 3D electronic and shape properties, not their 2D chemical structure, yet the drug discovery cycle has traditionally sought to predict biological activity based on 2D structure.

Cresset’s field technology is based on the XED force field, developed by Andy Vinter while working at the University of Cambridge, U.K. The XED force field places off-atom charges on electronegative atoms, which gives a more accurate representation of the charge density surrounding an atom than atom-centered charges alone. The resulting fields are expressed as field points around the chemical structure (Figure 1).

Figure 1. The 2D structures (left) and field patterns (right) of bioisosteres both active at PDE3: cAMP (the natural substrate) and SKF93741, a PDE3 inhibitor. The field patterns reveal that these two structurally diverse molecules are biologically similar.
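The effect of off-atom charges can be illustrated with a toy calculation. The sketch below is not the XED force field: the charge positions, charge values, and the function name potential are invented for illustration, and the potential is a bare Coulomb sum in arbitrary units. It only shows that moving part of the charge off the atom center makes the surrounding potential direction-dependent, which a single atom-centered charge cannot do.

```python
import math

def potential(charges, point):
    """Coulomb potential (arbitrary units, q / r) at `point`.

    `charges` is a list of (x, y, z, q) point charges.
    """
    total = 0.0
    for x, y, z, q in charges:
        r = math.dist((x, y, z), point)
        total += q / r
    return total

# Hypothetical atom at the origin with a net charge of -0.5.
atom_centred = [(0.0, 0.0, 0.0, -0.5)]

# Same net charge, but split into a positive core and a negative
# off-atom charge displaced 0.5 units along +x (illustrative values only).
off_atom = [(0.0, 0.0, 0.0, 0.5), (0.5, 0.0, 0.0, -1.0)]

# Two probe positions at the same distance from the atom center.
ahead = (3.0, 0.0, 0.0)
behind = (-3.0, 0.0, 0.0)

for label, model in (("atom-centred", atom_centred), ("off-atom", off_atom)):
    print(label, potential(model, ahead), potential(model, behind))
```

With the atom-centered model the two probes see the same potential; with the off-atom model the probe on the side of the displaced charge sees a more negative potential.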

Field point descriptions of molecules close the gap between chemistry and biology, giving a “protein’s eye” view of compounds. Using fields, structurally diverse yet biologically similar molecules look alike.

Computationally generated field patterns open the possibility of analyzing and searching for compounds on the basis of activity rather than structure. This leads to the rapid identification of novel structures from diverse chemical series that are likely to show similar biological activity.
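As a rough illustration of comparing molecules by field pattern rather than by structure, the sketch below scores two sets of field points with a Gaussian overlap. This is not Cresset's scoring function: the tuple layout, the kind labels, the width parameter, and the function names are all assumptions, and the two molecules are assumed to be already aligned in a common coordinate frame.

```python
import math

# A field point is (x, y, z, magnitude, kind), where kind is a label such
# as "neg", "pos", "shape" or "hydrophobic". This layout is hypothetical.

def overlap(a, b, width=1.0):
    """Gaussian-weighted overlap between field points of the same kind."""
    total = 0.0
    for xa, ya, za, ma, ka in a:
        for xb, yb, zb, mb, kb in b:
            if ka != kb:
                continue  # only like field types contribute
            d2 = (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
            total += ma * mb * math.exp(-d2 / (2.0 * width ** 2))
    return total

def field_similarity(a, b, width=1.0):
    """Dice-style normalized overlap: 1.0 for identical patterns."""
    denom = overlap(a, a, width) + overlap(b, b, width)
    return 2.0 * overlap(a, b, width) / denom if denom else 0.0

# Illustrative patterns only; not derived from real molecules.
template = [(0.0, 0.0, 0.0, 1.0, "neg"), (3.0, 0.0, 0.0, 1.0, "pos")]
shifted = [(0.3, 0.0, 0.0, 1.0, "neg"), (3.2, 0.1, 0.0, 1.0, "pos")]
swapped = [(0.0, 0.0, 0.0, 1.0, "pos"), (3.0, 0.0, 0.0, 1.0, "neg")]

print(field_similarity(template, shifted))
print(field_similarity(template, swapped))
```

A pattern with the same field types in nearly the same places scores close to 1, while a pattern with the charges reversed scores far lower even though the point positions are identical.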

When designing a focused library, the goal is to retain the important features known to be associated with activity while at the same time exploring the maximum amount of chemical space. Producing a field template to define the desired biological activity makes it possible to optimize the structural diversity of the library while retaining the focused activity.

Knowledge of the Target

Knowledge of the target is the most important starting point for designing a focused screening library, and it’s important to assess how much information is available about the target before deciding how to proceed. With an uncharacterized target, random screening is the first approach. But the more users know about the target, the better chance they have of finding new compounds to hit the target, and the more focused the library can be.

For example, when designing a library for H3 antagonists the target is well characterized. There are a number of ligands for it and there are also drugs on the market that hit the receptors. With this knowledge it is possible to build a field template that will lead to a focused screening library with a high chance of success.

Field templates, or pharmacophores, are used in library design to predict the activity of compounds at therapeutic targets. They can be compared to the biological fingerprint for a protein binding site.

The first step in building a field template is to analyze active ligands that interact with the target to find a common shape for binding. Where the 3D shape of the protein active site is not known, Cresset’s forgeV10 computational suite is used to compare the conformations of the ligands to find their optimum alignment in the binding site of the protein. This alignment, or an alignment generated from protein-ligand crystal data, together with structure activity data is used to find the field points that are likely to correspond to important features in the active site.

To illustrate this point, forgeV10 was used to build a library of potential H3 antagonists. A series of seven highly active H3 antagonists were identified from the literature and aligned in their bioactive conformations to generate a consensus field template (Figure 2).

Figure 2. A series of seven highly active H3 antagonists were identified from the literature and aligned in their bioactive conformations to generate a consensus field template.

As confirmation of the predictive capability of this template, the field match scores of 68 further H3 antagonists, described in the scientific literature and outside the original training set, were compared against their known activity (Ki) values. A good match of fields to activity was confirmed.
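One common way to check how well a score tracks measured activity is a rank correlation. The sketch below computes a Spearman coefficient between field match scores and pKi values. The article does not say which statistic was used for the H3 validation, so this choice is an assumption, and the five data points are invented placeholders rather than the published H3 data.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Placeholder data: Ki in nM and hypothetical field match scores.
ki_nm = [2.0, 15.0, 40.0, 300.0, 1200.0]
field_scores = [0.82, 0.78, 0.74, 0.61, 0.66]

# Lower Ki means higher activity, so convert to pKi = 9 - log10(Ki in nM).
pki = [9.0 - math.log10(k) for k in ki_nm]

print(spearman(field_scores, pki))
```

Converting Ki to pKi first means that a positive coefficient corresponds to higher field match scores going with more potent compounds.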

The H3 template was then used to screen Cresset’s compound collection to identify potential H3 antagonists. A large number of matches were identified, with 68 distinct chemical scaffolds.

This example demonstrates how forgeV10 can be used to search new areas of chemical space for new candidates. The field analyses take users beyond the limitations of chemical structure, to find compounds with similar activity but varying chemotypes, leading to new starting points for research.

Field templates can also be built for toxicity targets as well as for therapeutic targets. A range of such templates can be derived and used as filters to counterscreen a library of compounds.

In the H3 example, the compounds were screened against field templates for CYP 2D6 and hERG. Approximately 4% of the compounds were rejected due to potential 2D6 toxicity and a further 8% due to potential hERG toxicity.
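The counterscreening step can be sketched as a chain of filters applied in order, with each filter only seeing the compounds that survived the previous one. That sequential counting is one reading of "a further 8%" and is an assumption here, as are the dictionary keys, the 0.7 thresholds, and the example scores.

```python
def counterscreen(compounds, filters):
    """Apply (name, is_flagged) filters in order.

    Returns the surviving compounds and a dict of how many compounds
    each filter rejected among those still remaining at that stage.
    """
    survivors = list(compounds)
    rejected = {}
    for name, is_flagged in filters:
        kept = [c for c in survivors if not is_flagged(c)]
        rejected[name] = len(survivors) - len(kept)
        survivors = kept
    return survivors, rejected

# Hypothetical similarity scores against two toxicity field templates.
compounds = [
    {"id": "c1", "cyp2d6": 0.30, "herg": 0.20},
    {"id": "c2", "cyp2d6": 0.85, "herg": 0.10},
    {"id": "c3", "cyp2d6": 0.40, "herg": 0.90},
    {"id": "c4", "cyp2d6": 0.90, "herg": 0.95},
]

# Thresholds are illustrative, not values from the article.
filters = [
    ("CYP 2D6", lambda c: c["cyp2d6"] >= 0.7),
    ("hERG", lambda c: c["herg"] >= 0.7),
]

survivors, rejected = counterscreen(compounds, filters)
print([c["id"] for c in survivors], rejected)
```

Because the filters run in sequence, a compound flagged by both templates (c4 in this example) is counted only under the first filter that rejects it.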

Choosing Novel Scaffolds

We have seen how to build a focused library from existing compounds by searching a database to find new structures that are likely to be active against the target. However, libraries are also used to explore the chemical space around a hit, and forgeV10 is very effective in predicting novel bioisosteric compounds that will exhibit the same activity when key fragments of their structure are replaced.

As an alternative approach to library design, forgeV10 was used to replace the central core and so generate a novel scaffold replacement library. The results of this analysis can be seen in Figure 3.

Figure 3. A plot of field similarity against structural similarity for the novel scaffold replacement library. Five of the more interesting compounds, all of which are novel and have high similarity scores, have been highlighted in yellow.

The highlighted structures on the graph represent some of the most active known H3 antagonists from the literature, and the blue structures represent novel compounds generated by forgeV10. The graph shows a number of novel compounds with diverse central cores that have significantly higher predicted activities at H3, as shown by the higher field similarity score.

These highlighted compounds would be ideal candidates for inclusion in the final library as they combine innovation with chemical tractability and high predicted activity. Interestingly, the 2D similarity score of most of the dataset, including all of the highlighted molecules, is less than 0.7, which is a de facto cut-off for 2D-based scoring methods. This means that most of these structures would be very unlikely to be considered in a traditional library design process as there would be no reliable way to predict their activity.
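The selection logic described above, keeping compounds that score well on fields but fall below the 0.7 cut-off on 2D similarity, can be sketched as follows. The Tanimoto coefficient on fingerprint bit sets is a standard 2D similarity measure, but the article does not state which 2D method was used, so that choice is an assumption. The fingerprints, field scores, the field_min threshold, and the function names are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def pick_novel(candidates, reference_fp, field_min, sim2d_max=0.7):
    """Names of candidates with high field similarity but low 2D similarity.

    `candidates` is a list of (name, fingerprint, field_similarity) tuples.
    """
    selected = []
    for name, fp, field_sim in candidates:
        if field_sim >= field_min and tanimoto(fp, reference_fp) < sim2d_max:
            selected.append(name)
    return selected

# Hypothetical reference ligand fingerprint and candidate compounds.
reference_fp = {1, 2, 3, 4, 5, 6}
candidates = [
    ("close_analogue", {1, 2, 3, 4, 5, 7}, 0.80),  # structurally close
    ("new_core", {1, 2, 8, 9, 10}, 0.78),          # different scaffold
    ("weak_match", {11, 12}, 0.40),                # poor field match
]

print(pick_novel(candidates, reference_fp, field_min=0.7))
```

In this toy set only new_core is returned: close_analogue is too similar in 2D to count as novel, and weak_match does not reach the field similarity threshold.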


Drug discovery is an exercise in multi-parameter optimization. Cresset’s XED force field algorithms enable users to accurately quantify electrostatic field similarity, which relates directly to one of the most important parameters—the biological activity of compounds.

forgeV10 is a comprehensive software suite that uses Cresset’s XED force field to predict the conformation and activity of ligands. It can be used to build field templates that give a biological fingerprint for a protein target. This template can be used to screen compound collections or fragment libraries to help to build focused screening libraries with a high chance of success.


Martin Slater is director of consulting and Katriona Scoffin is scientific writer at Cresset BioMolecular Discovery.