Andrew LeBeau, PhD,
Associate Vice President of Product Integrations, Dotmatics

The first-ever mRNA vaccines, namely, the Pfizer-BioNTech and Moderna mRNA vaccines for SARS-CoV-2, are welcome for several reasons. They give us hope that the COVID-19 pandemic will be brought under control. They also represent a technology that could be readily adapted to fight pathogens other than SARS-CoV-2. Finally, they are part of a larger trend toward RNA-based medical interventions. These interventions include RNA-based therapeutics such as small interfering RNAs, antisense oligonucleotides, microRNAs, aptamers, and mRNAs. (And besides yielding prophylactic vaccines and therapeutics, RNA-based technologies may lead to therapeutic vaccines in immuno-oncology applications.)

In drug discovery, RNA therapies occupy a middle ground between traditional small-molecule drugs and large-molecule biologics. For example, RNA therapeutics are intermediate in terms of their size and their ability to access sites of action. They are also intermediate in terms of their sensitivity to individual chemical modifications, such as those meant to confer improved pharmacokinetic properties. They are, in a sense, intermediate in terms of the balance they strike between being chemically designed and naturally derived from biological molecules.

A common language for chemists and biologists

In general, small-molecule drugs are within the purview of chemists, and large-molecule biologics are within the purview of biologists. But several drug modalities are emerging that require expertise from both chemists and biologists. These modalities include conjugates such as antibody-drug conjugates (ADCs) and peptide-drug conjugates (PDCs). Other such modalities include, yes, RNA therapeutics.

These modalities offer great promise for the development of lifesaving therapies, but they also bring significant new challenges for drug discovery researchers. For example, chemists and biologists need to collaborate more closely on lead optimization. By forming cross-disciplinary teams, chemists and biologists improve their ability to routinely view and interact with therapeutic candidates, and the data associated with them, to make well-informed decisions about the next cycle of optimization.

But how chemists and biologists want to view the candidates and data can vary greatly. When viewing candidate molecules, a chemist will generally want to see a molecular representation, whereas biologists are typically more comfortable with a sequence-based view. Small interfering RNAs and antisense oligonucleotides are certainly small enough to permit a full chemical representation, which is critical for full uniqueness checking in registration systems and rigorous intellectual property (IP) protection.

From a biologist’s perspective, sequence representations for RNA therapeutic candidates can become more complex than those for natural sequences. With such candidates, the nucleobase, sugar molecule, and phosphate backbone typically need to be represented individually so as to effectively communicate where chemical modifications have been made. By accessing a library of nonnatural monomers, biologists can understand the nature of the candidates they’re working on, and the potential effects of additional optimizing modifications.
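
To make the idea of per-component representation concrete, here is a minimal Python sketch. It is an illustration only, not the data model of any particular informatics product: the `Nucleotide` class, the helper functions, and the single-letter codes for sugars and linkages are all hypothetical placeholders standing in for a real monomer library. Nonnatural bases could be handled the same way, with additional base codes.

```python
from dataclasses import dataclass

# Hypothetical monomer codes, chosen for this sketch only:
#   sugar   "r" = ribose (natural), "m" = 2'-O-methyl, "f" = 2'-fluoro
#   linkage "p" = phosphodiester (natural), "s" = phosphorothioate
NATURAL_SUGAR = "r"
NATURAL_LINKAGE = "p"


@dataclass(frozen=True)
class Nucleotide:
    """One position in a strand, with base, sugar, and linkage kept separate."""
    base: str
    sugar: str = NATURAL_SUGAR
    linkage: str = NATURAL_LINKAGE


def plain_sequence(strand):
    """Sequence view with all modification detail hidden."""
    return "".join(n.base for n in strand)


def annotated_sequence(strand):
    """Sequence view that flags nonnatural sugars and linkages at each position."""
    tokens = []
    for n in strand:
        sugar = "" if n.sugar == NATURAL_SUGAR else n.sugar
        linkage = "" if n.linkage == NATURAL_LINKAGE else n.linkage
        tokens.append(f"{sugar}{n.base}{linkage}")
    return ".".join(tokens)


def modifications(strand):
    """List (position, component, code) for every nonnatural component."""
    found = []
    for position, n in enumerate(strand, start=1):
        if n.sugar != NATURAL_SUGAR:
            found.append((position, "sugar", n.sugar))
        if n.linkage != NATURAL_LINKAGE:
            found.append((position, "linkage", n.linkage))
    return found


strand = [
    Nucleotide("A", sugar="m", linkage="s"),
    Nucleotide("U"),
    Nucleotide("G", sugar="f"),
]
print(plain_sequence(strand))
print(annotated_sequence(strand))
print(modifications(strand))
```

Keeping the three components as separate fields means that a change to one of them (say, swapping a single linkage) is visible at a specific position rather than being lost inside a plain letter sequence.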

Dotmatics is a scientific informatics software and services company that automates laboratory workflows for discovery and innovation research. The company notes that in these workflows, informatics systems need to define candidate therapeutics and vaccines rigorously. Such candidates may include antisense oligonucleotides, small interfering RNAs, and mRNA vaccines. If such candidates are to be developed effectively, informatics systems need to represent chemical modifications to the nucleobase, the backbone, or both.

Ensuring IP protection

To work effectively, cross-functional teams require data management systems that support their diverse needs. By implementing scientific informatics systems, these teams can capture and manage drug discovery data, making it accessible to researchers and facilitating its analysis and interpretation. However, implementing scientific informatics systems is a challenging task even at companies exclusively focused on small molecules or biologics. Implementing them is even more challenging at companies that work on cross-functional drug discovery.

As stated earlier, chemists and biologists want to work primarily in their respective domains. When querying and viewing therapeutic candidates, chemists prefer molecular structures and biologists prefer sequences. But this must not lead to dual representations of the candidates being maintained within the organization. It is crucial that a single source of truth be maintained to ensure that IP is rigorously protected and to guarantee that all scientists understand each other even if they speak different scientific languages.

Given the current state of the art for scientific informatics, the optimal way to get that single source of rigorous truth is to generate representations of therapeutic candidates that incorporate full chemical descriptions. Such representations will typically be suitable, just as they are, for chemists. For biologists, the representations will need to be faithfully translated into a sequence-based view.
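
One way to picture this arrangement is a registry that stores only the full chemical description and derives everything else from it. The Python sketch below is a simplified assumption-laden illustration, not a description of how any real registration system works: the `Registry` class, the `(sugar, base, linkage)` tuples, the monomer codes, and the use of a SHA-256 hash of a canonical string as the uniqueness key are all choices made for this example.

```python
import hashlib


def canonical_string(candidate):
    """Serialize the full chemical description in one fixed, unambiguous order."""
    return ".".join(f"{sugar}({base}){linkage}" for sugar, base, linkage in candidate)


def registration_key(candidate):
    """Uniqueness key computed from the full description, not from the bases alone."""
    return hashlib.sha256(canonical_string(candidate).encode("utf-8")).hexdigest()


class Registry:
    """Hypothetical single source of truth: one stored record per unique candidate."""

    def __init__(self):
        self._records = {}

    def register(self, candidate):
        """Return (key, is_new); an exact duplicate is detected rather than stored twice."""
        key = registration_key(candidate)
        is_new = key not in self._records
        if is_new:
            self._records[key] = tuple(candidate)
        return key, is_new

    def chemical_view(self, key):
        """Full description, as a chemist would want it."""
        return canonical_string(self._records[key])

    def sequence_view(self, key):
        """Sequence derived on demand from the stored record, never stored separately."""
        return "".join(base for _, base, _ in self._records[key])


# Illustrative monomer codes: "R" ribose, "mR" 2'-O-methyl ribose,
# "P" phosphodiester, "sP" phosphorothioate.
natural = (("R", "A", "P"), ("R", "U", "P"))
modified = (("mR", "A", "sP"), ("R", "U", "P"))

registry = Registry()
key_natural, _ = registry.register(natural)
key_modified, _ = registry.register(modified)
print(registry.sequence_view(key_natural), registry.sequence_view(key_modified))
print(key_natural != key_modified)
```

In this sketch the two candidates share the same sequence view but receive different registration keys, because the key is computed from the complete chemical description. The sequence view is a translation of the stored record, so it cannot drift out of step with it.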

Data accessibility

A single source of truth about therapeutic candidates is useful to cross-functional teams only to the extent that it is readily accessible. And if it is to be accessible, it must be a centralized data repository of registration information. Besides data about therapeutic candidates, researchers participating in drug discovery projects need data of many other types, including experimental data, assay/screening data, inventory data, and analytical data. Data of all these types should be centralized and allowed to flow freely within and, as appropriate, across organizations—provided appropriate access controls and security measures have been implemented.

One way to achieve this centralized data system is to use web-based, software-as-a-service (SaaS) applications in which data are centrally hosted. SaaS systems are common now in drug discovery, with a growing proportion being cloud deployed. And while they don’t guarantee the level of data access that is required for projects to run smoothly, SaaS systems represent a major step in the right direction.

But the reality of drug discovery today is that desktop applications still enjoy extensive use. They are running on the computers of individual researchers. Replacing these applications, which are often beloved by researchers, can be challenging. Researchers may even resist the efforts of information technology (IT) staff—unless it is understood that desktop applications and centralized data management systems can complement each other.

Rather than aspire to IT nirvana (a pure web-based, cloud-hosted system), IT organizations could use application programming interfaces (APIs) and other technological approaches to allow data to be smoothly exchanged between local desktop applications and a central system. For example, if helper applications are deployed on users’ local machines, data from the central system can flow to local desktop applications, where analyses can be performed. The analyses and their interpretations can then be uploaded to the central system.
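
The round trip performed by such a helper application can be sketched in a few lines of Python. Everything named here is hypothetical: `CentralStore`, `InMemoryStore`, `sync_analysis`, and the record identifier are invented for illustration and do not correspond to any vendor’s API. The in-memory class stands in for a real central system so that the example runs without a network connection.

```python
from statistics import mean
from typing import Callable, Protocol


class CentralStore(Protocol):
    """Assumed minimal interface that a central system's API might expose."""

    def fetch(self, record_id: str) -> list: ...

    def upload(self, record_id: str, result: dict) -> None: ...


class InMemoryStore:
    """Stand-in for the central system, used here only to keep the sketch runnable."""

    def __init__(self, datasets):
        self.datasets = dict(datasets)
        self.results = {}

    def fetch(self, record_id):
        return list(self.datasets[record_id])

    def upload(self, record_id, result):
        self.results[record_id] = result


def sync_analysis(store: CentralStore, record_id: str, analyze: Callable) -> dict:
    """Pull data from the central system, analyze it locally, push the result back."""
    values = store.fetch(record_id)    # central system -> local machine
    result = analyze(values)           # the local desktop analysis step
    store.upload(record_id, result)    # local machine -> central system
    return result


def summarize(values):
    """Placeholder for whatever analysis the desktop application performs."""
    return {"n": len(values), "mean": mean(values)}


store = InMemoryStore({"assay-001": [1.0, 2.0, 3.0]})
print(sync_analysis(store, "assay-001", summarize))
```

Because `sync_analysis` depends only on the two-method interface, the same helper logic could in principle sit in front of any central system that offers equivalent fetch and upload operations, while the analysis itself stays in the tool the researcher already uses.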

Such a model provides a very happy balance between the desires of the organization’s project team and those of the end users. The project team can improve data management and access, and the end users can keep the applications they know and love—applications with which they are already productive.

Overall, the ability to bridge the gap between cross-disciplinary teams can support a more streamlined workflow for RNA-based drug discovery. Accessible, secure data is key to building mutual understanding and ensuring that the rest of the drug development process draws on this single source of truth. Informatics systems can help meet these challenges by creating a centralized data store that is accessible to, and understood by, both chemists and biologists, yet flexible enough to allow use of local applications. By utilizing such systems, cross-functional organizations can limit time wasted translating and managing data, and they can spend more time using that data to discover and develop novel RNA therapeutics.


Andrew LeBeau, PhD, is associate vice president of product integrations at Dotmatics.
