One step forward, two steps back. Doesn't that sound familiar? So often, when pursuing a complex objective, we fall into the all-too-human trap of rushing headlong down the path of least resistance. We shouldn't be too hard on ourselves, though. In fact, this is an edict of physics: the prime law of classical mechanics is the principle of least action. People seldom deviate from this maxim.
A good engineer, however, will soberly remind us that the only experiments that can be interpreted are those that were designed to test a hypothesis. It is a profound statistical truism that is far too often overlooked in applied science.
More typically, the data that can be generated soonest becomes the primary axis of optimization. Seldom, of course, is this direction aligned with the end-goal.
Drug Discovery's Least Action Principle
In drug discovery, the path of least resistance, the most available data, or the primary axis of optimization is too often a biochemical or cell-based measure of potency.
Multivariate statistics teaches us that any complex optimization problem has certain latent variables, combinations of factors that, taken together, lead most directly to the stated goal. In drug discovery, for a specific target, this principal component is an admixture of potency and other factors, such as pharmacokinetics, safety, and selectivity.
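As a toy illustration of this point, the sketch below builds a synthetic compound dataset (all numbers invented, property names chosen for illustration only) in which one shared latent factor, say lipophilicity, drives potency up while driving solubility and metabolic stability down. The first principal component recovered by PCA is then exactly such an admixture of all three properties:

```python
import numpy as np

# Synthetic illustration: 50 "compounds" with three correlated properties
# (potency, solubility, metabolic stability), all driven by one shared
# latent factor. Numbers and loadings are invented for this sketch.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))              # shared driver, e.g. lipophilicity
noise = rng.normal(scale=0.3, size=(50, 3))
# potency rises with the latent factor; solubility and stability fall with it
X = latent @ np.array([[1.0, -0.8, -0.6]]) + noise

Xc = X - X.mean(axis=0)                        # center before PCA
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                                    # first principal component
explained = s[0] ** 2 / (s ** 2).sum()         # variance explained by PC1

print("PC1 loadings (potency, solubility, stability):", np.round(pc1, 2))
print("variance explained by PC1:", round(explained, 2))
```

The first principal component loads on all three properties at once, with potency opposed in sign to solubility and stability, which is the latent-variable structure the text describes.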
A common problem, however, is that these additional data arrive at a different pace from the potency data. Hence, despite the prevailing wisdom, many lead optimization projects fall prey to the principle of least action, and chemists design their projects into a corner: high potency and low bioavailability.
The problem is a result of the time-scales of the design-synthesize-test cycle in iterative lead optimization. Design takes a few hours of cerebral activity, synthesis takes roughly a week in the lab, and the time-scale of testing depends on the type of data: days for potency, weeks for ADME, and even longer for Tox.
So the chemist is faced with a decision: either wait weeks to collect all the data before starting another design-synthesize-test cycle, which is unthinkable, or wait days for the potency data, start another design cycle, and vainly promise to rethink the path once the ADME data arrive.
Human nature prevails in these situations on two fronts: the pragmatic, fast-action path is chosen, and the promise to revisit the design in view of ADME data becomes chronically deferred. This is the path of least resistance.
Because potency and ADME/Tox properties are generally anti-correlated, it is not uncommon for project teams to snooker themselves, developing high-potency, non-bioavailable lead compounds for which any change that might improve the ADME characteristics comes at a large sacrifice in potency.
But it doesn't need to be that way. With the advent of biochemical and cell-based screens for ADME characteristics, there is the potential to deliver these results within the decision time-frame of the design-synthesize-test cycle. The key requirement is no longer scientific but one of implementation: not the ability to generate ADME/Tox data, but the ability to generate ADME/Tox data in time to inform critical decisions.
Haste Makes Waste - Putting Quality First
There is, however, a strong caveat to this that has a lot to do with another maxim of human behavior in the post-Microsoft Office era. Once a number appears, it is cut and pasted into Excel spreadsheets for analysis and decision-making, into PowerPoint presentations for discussion and group-think, into Word documents for reports and publications, and into Outlook emails for broad dissemination.
So data that appears once can very easily become separated from its context, documentized, and broadly distributed until it becomes part of the organizational urban legend. I call this the database-document-dogma paradigm.
Undoing the effect of dogmatized errant data is far more difficult than avoiding it in the first place.
What does this have to do with ADME? Well, later-stage ADME/Tox data are carefully generated, providing a high degree of analytical acuity and controlling for the potential artifacts of low solubility and high non-specific binding. The earlier data, on the other hand, are generated with reasonable analytical acuity but no explicit regard for these physicochemical artifacts.
The prevailing view is that the economics of the situation, i.e., the sheer number of samples, requires scaled-down assays to support throughput goals and, as argued above, turnaround-time goals.
The potential harm here comes from the database-document-dogma paradigm. It runs like this: a project team receives a number of lead compounds from the hit-to-lead group, has them all profiled, and decides to pursue the series whose metabolic stability looks high, completely unaware that the results reflected only an apparent stability, with no specific information about the unbound fraction.
Furthermore, the CYP inhibition profile appeared clean, but the solubility data, on closer inspection, showed a solubility significantly lower than the analyte concentration used in the early CYP assay. The project continues to optimize potency, adding lipophilicity, and only spot-checks the ADME properties.
Excitement builds and the team nominates some compounds for Tier II ADME profiling, where the experiments give a much clearer insight and control for solubility and non-specific binding. The results are shocking to the project team. CYP 3A4 inhibition is high, solubility is low, plasma protein binding is high, and intrinsic clearance is high. Essentially the entire project has ground to a halt because its ADME properties are awry.
Meanwhile, there was a second chemical class presented by lead discovery, which showed moderate metabolic stability and moderate CYP inhibition, liabilities that could be overcome with appropriate synthetic steps.
What was not appreciated at the time of selection, however, was that this series had much higher solubility and much lower non-specific binding, so that the real liabilities were actually less than the series that was chosen.
In our example, the project team went back to this second class, put it directly into the Tier II assays, confirmed its relative advantage in those experiments, and made progress toward the clinic.
There were two costs to the drug discovery organization, however: about six months of lost time, plus a significant increase in demand for the Tier II experiments, which were almost never completed within the timeframe of the potency data.
There is an alternative to running every compound through the expensive Tier II assays: run the solubility and non-specific-binding experiments (for example, plasma protein binding) first, and use those results to queue the compounds into specially designed variants of the CYP and metabolic stability assays.
This way every number that is generated is free, to a large extent, of the sort of bias these physicochemical properties can introduce.
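The queuing idea above can be sketched as simple triage logic. Everything here is a hedged illustration: the thresholds, compound names, and assay-variant labels are assumptions made for the sketch, not established protocol values.

```python
from dataclasses import dataclass

# Assumed, illustrative thresholds -- not protocol values.
SOLUBILITY_FLOOR_UM = 10.0    # assumed analyte concentration of the standard assay, in uM
HIGH_BINDING_FRACTION = 0.95  # assumed cutoff for "high" plasma protein binding

@dataclass
class Compound:
    name: str
    solubility_um: float          # aqueous solubility, micromolar
    plasma_bound_fraction: float  # fraction bound to plasma protein, 0..1

def queue_assay_variant(c: Compound) -> str:
    """Route a compound to the CYP/metabolic-stability assay variant that
    controls for its physicochemical artifacts (hypothetical variant names)."""
    low_solubility = c.solubility_um < SOLUBILITY_FLOOR_UM
    high_binding = c.plasma_bound_fraction > HIGH_BINDING_FRACTION
    if low_solubility and high_binding:
        return "low-concentration assay with unbound-fraction correction"
    if low_solubility:
        return "low-concentration assay variant"
    if high_binding:
        return "standard assay with unbound-fraction correction"
    return "standard assay"

compounds = [
    Compound("CPD-001", solubility_um=2.0, plasma_bound_fraction=0.99),
    Compound("CPD-002", solubility_um=150.0, plasma_bound_fraction=0.40),
]
for c in compounds:
    print(c.name, "->", queue_assay_variant(c))
```

The design point is that the cheap physicochemical measurements gate the routing decision, so a CYP or stability number is only ever generated under conditions where it can be interpreted.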
Of course, this puts even more logistical pressure on the ADME screening workflow and makes it even more challenging to deliver results in time. However, there are examples of emerging technology that provide end-to-end workflow support, streamlining all the logistics between request and result as well as providing built-in support for hierarchical screening.
Put to good use, a system like this can help deliver unequivocal data in time to make critical research decisions, thereby avoiding two very common weaknesses of human nature: the principle of least action and database-document-dogma.