Mining Nature’s Diversity for Novel Cas9 Tools

New England Biolabs®, CasZyme, and Corteva Agriscience Collaborate to Explore Biochemically Diverse Cas9 Orthologs

Sponsored content brought to you by New England Biolabs


The discovery of bacterial defense mechanisms against invaders, such as phage, catalyzed the molecular biology field by enabling restriction enzyme-based molecular cloning. For 45 years, New England Biolabs (NEB®) has made this toolset available to researchers by providing an ever-expanding catalog of enzyme specificities. In 1988, NEB held the inaugural meeting on DNA Restriction and Modification in Gloucester, MA, inviting scientists from all over the world to share their findings on these systems. Among the researchers was Virginijus Šikšnys, CSc, from the Institute of Biotechnology in Vilnius, Lithuania.

Virginijus Šikšnys, CSc
Founder, CasZyme

In the 1980s, as a chemist, Šikšnys became interested in the molecular mechanism of restriction modification (R-M) systems. He reasoned that, by understanding the molecular mechanism of sequence recognition, enzymes could be engineered to target specific sequences. However, despite the hope for site-specific recognition through restriction enzyme protein engineering, this approach proved challenging.

Fast forward 30 years to the discovery that components from the CRISPR-Cas bacterial defense system demonstrate site-specific DNA modification, which can be rapidly and inexpensively programmed with readily available nucleic acids. Alongside Jennifer Doudna, PhD, and Emmanuelle Charpentier, PhD, Virginijus Šikšnys, CSc, was recognized for contributions to this body of work with the 2018 Kavli Prize in Nanoscience.

While many of the studies that have led to these developments were performed using CRISPR-Cas9 from S. pyogenes, it is widely known that, like restriction enzymes, great diversity in these systems is found in nature. Last year, NEB announced a collaboration with CasZyme, founded by Šikšnys in Vilnius, Lithuania, and Corteva Agriscience, to explore the diversity of these systems, characterize new CRISPR-Cas nucleases, and commercialize them to make them widely accessible to researchers. An upcoming publication by Giedrius Gasiunas et al. describes the diversity found by screening and characterizing 79 Cas9 orthologs.

Virginijus Šikšnys answered some questions about CRISPR-based technology and what we are looking for when screening candidate orthologs:

What limitations exist with current CRISPR-Cas9 based technologies?

The CRISPR-Cas9 system is now rapidly advancing into clinical settings for the treatment of human diseases. However, there are still important challenges to overcome to enable broader applications of CRISPR-Cas9 technology.

Challenges include:

PAM requirement. To initiate base-pairing, Cas9 requires a short nucleotide sequence called a protospacer-adjacent motif, or PAM. If there is no PAM near the target site, Cas9 cannot access it. Even the most versatile SpyCas9 variant cannot access approximately 75% of the single-base mutations that are associated with human disease, because there is no appropriately located PAM.

Delivery. Cas9 must be delivered into tissues or cells to achieve a therapeutic effect, and this is difficult. Adeno-associated virus (AAV) is often the preferred delivery vehicle; however, the packaging limit of AAV vectors restricts this approach to small Cas9 orthologs, and in most cases these small Cas9 proteins recognize longer PAMs, thereby limiting the regions they can target.

Off-target effects. Large genomes may contain DNA sequences that are identical to or closely resemble the target sequence, resulting in nonspecific cleavage by Cas9 at nontarget sites, yielding undesirable mutations.

Low efficiency of homology-directed repair (HDR), which is important for precisely editing or knocking in genes.

To overcome these potential complications, it is necessary to either redesign existing Cas9 variants or explore the diversity of CRISPR-Cas9 systems found in nature.
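
To make the PAM constraint concrete, here is a minimal Python sketch that scans both strands of a DNA sequence for candidate target sites next to a given PAM. It assumes a 20-nt protospacer with the PAM immediately 3' of it, as for SpyCas9 (NGG). The function names (find_targets, reverse_complement) and the default parameters are illustrative choices, not part of any published pipeline.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(seq, pam="NGG", spacer_len=20):
    """Return (strand, start, protospacer, pam_seq) tuples.

    Assumption: the PAM lies immediately 3' of the protospacer.
    Coordinates are 0-based on the strand that was scanned.
    """
    seq = seq.upper()
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_regex))
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits
```

Calling find_targets(sequence, pam="NNGRRT") instead of the default shows how a longer PAM, such as the one recognized by the small Cas9 from S. aureus, typically leaves fewer targetable sites in the same sequence.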

What kinds of properties were you looking for in the orthologs you screened, and how could these be helpful to researchers?

We screened Cas9 candidates that represent the amino acid diversity of identifiable orthologs. We then focused on characteristics that begin to address the limitations mentioned above. We reasoned that Cas9 proteins with distinct PAM specificities may expand the sequence space targeted by Cas9 and help to overcome limitations associated with PAM recognition. The orthologs we identified in sequence databases varied in size. We characterized a subset of small Cas9 orthologs that can potentially be utilized for delivery via viral vectors. Off-target effects and HDR are harder to address in the context of large screens and so our work continues on these fronts.
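
The size argument for viral delivery can be illustrated with simple arithmetic. The sketch below assumes the commonly cited AAV packaging capacity of roughly 4.7 kb and uses the known lengths of SpyCas9 (1,368 amino acids) and the smaller S. aureus Cas9 (1,053 amino acids). The 1,000 bp allowance for promoter, polyadenylation signal, and guide RNA cassette is a hypothetical placeholder, and the function names are illustrative.

```python
AAV_CAPACITY_BP = 4700  # approximate, commonly cited AAV packaging limit


def cds_length_bp(protein_length_aa):
    """Coding sequence length in bp, including one stop codon."""
    return (protein_length_aa + 1) * 3


def fits_in_aav(protein_length_aa, other_elements_bp,
                capacity_bp=AAV_CAPACITY_BP):
    """True if the Cas9 coding sequence plus other elements fit the limit."""
    return cds_length_bp(protein_length_aa) + other_elements_bp <= capacity_bp


# other_elements_bp=1000 is a hypothetical allowance for promoter,
# polyadenylation signal, and guide RNA cassette; real designs differ.
for name, length_aa in (("SpyCas9", 1368), ("SauCas9", 1053)):
    print(name, cds_length_bp(length_aa), fits_in_aav(length_aa, 1000))
```

Under these assumptions the SpyCas9 cassette exceeds the limit while the smaller ortholog fits, which is why compact Cas9 proteins are attractive for AAV delivery.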

What were the most interesting findings in this work, and what do they tell us about Cas9 biology?

We identified new Cas9 proteins associated with approximately 50 different PAMs that varied both in sequence composition (A-rich, T-rich, C-rich, G-rich variants) and length (2–5 nts). When we computationally analyzed the PAM interaction domains of the 79 different Cas9 orthologs, we concluded that much of this amazing diversity is explained by just four primary PAM interaction domain families. This may help to expand the sequence space targetable by Cas9 and provide blueprints for future engineering efforts to change PAM sequence requirements further.

When we analyzed the guide RNA solutions from our collection, we found that they could be clustered into several orthologous groups. While guide RNA structure and functionality have been described previously by Caribou Biosciences and the Charpentier, Church, and Doudna labs, we have found and added new guide RNA types to this paradigm. In the future, it will be interesting to determine the mechanistic differences between these orthologous groups and establish new multi-Cas9 editing approaches.

The termini of cleaved DNA ends varied among the Cas9 orthologs in vitro—ranging from exclusively blunt-ended, to one nucleotide 5´-overhangs, to staggered ends with two or more nucleotide 5´-overhangs. This observation implies differing arrangements and activities of the nuclease domains within the active sites of the Cas9 orthologs and may have implications for in vitro uses of the enzymes or repair outcomes in genome editing applications—especially homology-directed repair.

Are there applications outside of genome editing where CRISPR-Cas nucleases are useful?

CRISPR-Cas9 nucleases have the amazing ability to search, recognize, and bind to almost any dsDNA sequence, and they are programmable by changing the sequence of the guide RNA. While not technically genome “editing,” CRISPR-Cas9 variants can be a very useful tool for activating or repressing gene expression by targeting regulatory protein domains to genomes. Additionally, when used to target epigenetic modifying protein domains to specific genomic loci, CRISPR-Cas9 variants can be applied to study mechanisms of epigenetic gene regulation.

Outside of genome editing, CRISPR-Cas9 has many useful applications, including extremely selective DNA cleavage in vitro, specific binding, and detection or enrichment of DNA targets in complex mixtures. Aside from genome editing for therapeutic applications, CRISPR-Cas9 nucleases can also be useful for engineering the genomes of microbes. Some of the Cas9 orthologs that we identified in our study are active at high temperatures, making them potentially useful for genome engineering in thermophilic organisms important in biotechnology and research.

Where do you see CRISPR-based technology in five years?

CRISPR-based technology will continue to move into, and become more commonplace in, clinical applications including engineered cell therapies; likewise, in agriculture. The technology will likely be geared toward the introduction of more refined targeted changes. This is already being observed with epigenome engineering, base editing, and prime editing. We also believe that the ability to make multiple edits simultaneously will be a growing theme. For non-genome-editing applications, programmable CRISPR-Cas-based detection technologies will expand because of their specificity and the ease of changing the recognition sequence. On this front, we will likely see even more CRISPR-Cas applications in the diagnostics space, such as those from Sherlock and Mammoth, two early pioneers of this application.

A greater pool of Cas9 orthologs as a starting point for developing these types of technologies will enable many fields and applications.


For more information, visit the websites of New England Biolabs, CasZyme, and Corteva Agriscience.
For more information on this collaboration, visit this bioRxiv article, Biochemically diverse CRISPR-Cas9 orthologs.