Sam Sinai, PhD, Dyno Therapeutics Co-Founder and Machine Learning Team Lead

Over the past decade, adeno-associated viruses (AAV) have emerged as the safest and most popular delivery vehicle for the booming field of gene therapy. But there’s a catch: Some researchers have reported that at least half of the general population cannot benefit from AAV gene therapies because they have pre-existing immunity to the naturally occurring forms of those delivery vectors. Hence the interest from researchers in getting over the hump of pre-existing immunity to AAV vectors.

Among companies tackling that challenge is Dyno Therapeutics, a three-year-old startup applying artificial intelligence (AI) to develop gene therapies. Last year, Dyno announced collaborations with Novartis and Sarepta Therapeutics worth potentially more than $2 billion to develop gene therapies for eye and muscle diseases, respectively, based on Dyno’s CapsidMap™ platform.

CapsidMap uses AI to design novel capsids that confer improved functional properties to AAV vectors. In October 2020, Dyno announced a third collaboration of potentially $1.8 billion-plus with Roche and its Spark Therapeutics subsidiary to explore CapsidMap-based gene therapies for central nervous system (CNS) disorders.

Earlier this month, three of Dyno’s six co-founders—George M. Church, PhD, of Harvard Medical School; CEO Eric D. Kelsic, PhD; and Machine Learning Team Lead Sam Sinai, PhD—joined six colleagues from the company, Church’s lab, Harvard and its Wyss Institute for Biologically Inspired Engineering, Google Research, and University of Cambridge in detailing how they applied a computational deep learning approach to design highly diverse capsid variants from an AAV virus.

In “Deep diversification of an AAV capsid protein by machine learning,” published February 11 in Nature Biotechnology, researchers reported successfully using AI to generate an unprecedented diversity of AAV capsids, in order to identify functional variants capable of evading the immune system. Focusing on a 28-amino-acid segment, investigators generated more than 200,000 variants of the AAV2 wildtype sequence, which yielded some 110,000 viable engineered capsids, half of which surpassed the average diversity of natural AAV serotype sequences with 12-29 mutations across the region.

Sam Sinai discussed the study, and Dyno’s research into overcoming immunity to AAV capsids, with GEN Edge in the following interview (lightly edited for space and clarity).

GEN Edge: Can you begin by discussing the just-published study?

Sinai: This particular study is an application of machine learning models to diversify AAV gene therapy vectors. We want to make better AAV capsids, but one of the challenges AAV capsids face is that the natural serotypes have already been seen by most people’s immune systems, so many people are ineligible to receive them.

We wanted to change the natural serotypes sufficiently that they are different from what’s seen in nature, while preserving their functionality. When we started to study this in 2016-17, it was not known if we could change a protein by more than 10 mutations or so and keep its functionality intact in a synthetic way. We used a machine-learning approach, hoping that even though we were making so many changes to the gene therapy vector, we’d still keep it functional.

George Church, PhD, of Harvard Medical School, Dyno Therapeutics Scientific Co-Founder

We started this collaboration between three collaborators from Google and ourselves at the Wyss Institute (myself, Eric [Kelsic, PhD], and the other Dyno co-founder, George [Church, PhD]), and designed this study to really test the power of machine learning to make modifications to a capsid.

What we’ve generated is a huge collection of viral vectors. Our methods gave a very high yield of successfully modified capsids, and many of them were actually far, far deeper into the sequence space than the difference between any two natural serotypes.

GEN Edge: How were the capsids made different?

Sinai: In 2019, we published a paper in Science that focused on trying to modify the capsid everywhere, but just a tiny bit everywhere, and then try to use a simple model to see if we can improve its transduction.

In this study, we took it in a slightly different direction. We picked a region of the capsid that was representative and relevant; it had properties for both immune evasion and viral transduction. It’s a 28-amino-acid region. Then we introduced mutations, which are modifications to the sequence, by either swapping an amino acid for something else or inserting an amino acid between two amino acids. This way we would modify that 28-amino-acid region, which was both challenging and relevant, into a region that looked different. Many of the variants we produced had up to 29 modifications, either substitutions or insertions, relative to what was already in the capsid, and yet they would viably package into a capsid that was functional.
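The two kinds of edits described above, substitution and insertion, can be sketched in a few lines of Python. This is only an illustration of the edit types, not the study’s design method: the sequence below is an arbitrary 28-letter placeholder rather than the real AAV2 segment, the function names are made up for this example, and the edits are chosen at random, whereas the study used machine learning models to choose them.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard amino acids

# Arbitrary 28-residue placeholder, NOT the actual AAV2 capsid segment.
PLACEHOLDER_REGION = "ACDEFGHIKLMNPQRSTVWYACDEFGHI"


def substitute(seq, pos, new_aa):
    """Swap the residue at index pos for new_aa."""
    return seq[:pos] + new_aa + seq[pos + 1:]


def insert(seq, pos, new_aa):
    """Insert new_aa before index pos (pos == len(seq) appends at the end)."""
    return seq[:pos] + new_aa + seq[pos:]


def random_variant(seq, n_edits, rng):
    """Apply n_edits randomly chosen substitutions or insertions.

    Later edits may land on earlier ones, so the final sequence can differ
    from the starting sequence by fewer than n_edits changes.
    """
    for _ in range(n_edits):
        if rng.random() < 0.5:
            pos = rng.randrange(len(seq))
            choices = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
            seq = substitute(seq, pos, rng.choice(choices))
        else:
            pos = rng.randrange(len(seq) + 1)
            seq = insert(seq, pos, rng.choice(AMINO_ACIDS))
    return seq


rng = random.Random(0)  # fixed seed so the demo is repeatable
print(PLACEHOLDER_REGION)
print(random_variant(PLACEHOLDER_REGION, 5, rng))
```

Substitutions keep the length at 28, while each insertion lengthens the region by one residue.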

GEN Edge: Why was this 28-amino-acid segment chosen for study?

Sinai: First, because it had some functional relevance. We knew that this is a region that many antibodies might target and that it has a role in attaching to particular receptors. Finally, it was a region of sufficient complexity for this proof-of-concept work, so we could ensure that our methods are general and powerful enough to be applied anywhere else on the capsid.

GEN Edge: How usable were the variant capsids that were created?

A 28-amino-acid peptide within a segment of the AAV2 VP3 capsid protein that exposes the AAV capsid to neutralizing antibodies produced by individuals, and thus can be the cause of an immune response against the virus. Buried deeper in the capsid are the more purple-colored portions of the peptide, while the yellow portions are exposed on the virus’ surface. [Wyss Institute at Harvard University, original by Drew Bryant]

Sinai: When we started the study, no one knew if we could change it so much and still make a virus that is functional, at least in vitro. This study doesn’t go so far as trying the capsids, the viruses, in vivo. However, those are natural next steps that we are continuing to work on at Dyno.

One of the main selling points of Dyno is that we do things in vivo. We actually try this whole pipeline of designing with machine learning, and then generating capsids that are interesting, and then putting them into the animals, and then reading the results out of the animals.

So, this study, based on our initial work at Harvard, stops at the in vitro stage. We do consider these capsids of high potential because they are modified in a specific region. But there are no results out yet that we can share, about whether particular viruses in this group are good for other tasks that they were not designed for.

GEN Edge: Is there any thought to correlating the variations with how well they will work in a given part of the body or organ?

Sinai: Excellent question. I would say that this requires follow-up experiments, which highlights why the in vivo work is so important. If we don’t have measurements from in vivo, we cannot report any correlations about what patterns are useful for one part of the body versus another. We do think that that is important, and our work at Dyno indicates that that’s a very good direction. But it’s not what this study would cover.

GEN Edge: According to your latest study, more than 201,000 variants were generated from AAV2, yielding almost 111,000 viable engineered capsids. How infinite is the potential diversity of functional capsids?

Sinai: That’s a great point. Even just limited to this 28-amino-acid region, the space of possible variants is extremely big. It’s more than the atoms in the universe! So it’s impossible to test all of them. That’s where the power of machine learning comes in. Machine learning helps you detect those areas of the sequence space, the possible hypothetical sequences that have a higher probability of success.
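A back-of-the-envelope count gives a sense of the scale involved. The sketch below assumes 20 standard amino acids at each of the 28 positions and counts substitution-only variants; allowing insertions, as the study did, makes the space larger still. The variable names are illustrative.

```python
N_POSITIONS = 28    # length of the capsid region discussed in the interview
N_AMINO_ACIDS = 20  # assumption: only the 20 standard amino acids

# Every position can independently hold any amino acid.
substitution_only = N_AMINO_ACIDS ** N_POSITIONS
print(f"{substitution_only:.2e}")  # prints 2.68e+36

# Fraction of that space covered by a library of about 200,000 variants.
library_size = 200_000
print(f"{library_size / substitution_only:.1e}")
```

Even a library of hundreds of thousands of variants samples a vanishingly small fraction of the possibilities, which is why models that point to promising areas of the space matter.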

Different machine learning methods have different abilities to recognize the regions in which the virus is likely to be successful. That’s one of the contributions of the study: depending on the type of machine learning you use, you might find different types of diversity. You might succeed with multiple different machine learning models, but each of them has a different diversity profile. I cannot tell you what portion of this infinite space is functional virus. If you change the protein enough, it might turn from a virus into a different protein in the muscle. You could change it so much that it’s completely doing something else, right?

But the notion that these methods unlock a very large amount of diversity that was not available before is absolutely correct. Moreover, once you have verified that the models have a yield where, say, 57% of the capsids we intend to make actually package, it also means that you can try millions of sequences in silico, that is, you just design them with a computer, and you have a higher than 50% chance of knowing that these are actually correct. This again comes with caveats, such as that how far you go from the data set we have affects how well you perform. But the key takeaway is that, all of a sudden, you can screen millions of variants without having to do a single experiment to figure out if they will package or not.

GEN Edge: Could each person have their own individualized capsid to get around the immune system, or is there a more limited number of template capsids, where there’s a basic pattern with maybe a variant in one spot or another?

Sinai: I’m really glad you asked that. We are actually thinking about that very hard at Dyno. Both of these directions are possibilities. Machine learning can help you go towards personalized medicine, where you reduce the cost of personalizing a capsid for an individual. Obviously, you need information from those individuals. You can also think about repertoires of capsids that have different probabilities of working, depending on what population the patient is coming from.

One direction that we are working on in building models is a better ability to fine-tune the product to make sure it succeeds in a particular group of patients and, eventually, in a single patient, so that we are confident that what we designed is going to work for this patient.

GEN Edge: Why was this sequence space previously unreachable?

Sinai: One of the things we showed in the previous paper, and also show in this paper, concerns what happens if you randomly generate mutations. One commonly used method is directed evolution, in which people randomly generate modifications to the region of a protein they’re trying to engineer. Most often, beyond four, five, or six mutations, you cannot keep the functionality of the protein intact.

We have data that shows that if you randomly mutate the region that we are modifying beyond five mutations, you end up with basically nothing that works; less than half a percent of things would work. And what we show is that we get mutations up to 20, 24, 25 percent. A significant proportion of what we design is actually working. So, based on the data that we had before, and also knowledge from elsewhere that even most individual mutations tend to be deleterious to the original protein, we estimated that it’s really hard to make anything that’s farther out without being intelligent about it.

GEN Edge: How were the gene therapy vectors created and engineered?

Sinai: The gene therapy vector we work on is a special type of protein. It’s an assembled ball of multiple individual units. So, 60 of these come together and make a ball, and that’s what carries the gene into the particular tissue. It’s not a single protein. As a result, it’s actually quite hard to model it with normal biophysical models, because it comes with a lot of complexity. If you make a tiny change in it, it’s unclear how it affects all of that complexity, and you can’t tell the difference between small changes. Traditional methods that didn’t use machine learning actually had a hard time modeling this. This is extremely challenging to do.

People often use data from similar proteins—they take one AAV serotype, and then take all the related members of that family—and then ask, How much diversity do I have to be able to use it to modify this region? It turns out there are not that many AAVs available, and the average distance between two AAVs is about 12 mutations.

What we show here is that using our method, we actually can get far outside the range of diversity that we would get just by using the information available from other AAV sequences. So, because there are few sequences available, and the average distance between them is not that large, there is a need to use something different to generate the diversity that we have, and that’s one of the reasons this study is significant.
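The “average distance” between sequences mentioned above can be made concrete with a small sketch. It assumes the distance is measured as edit (Levenshtein) distance, which counts substitutions, insertions, and deletions; that metric is an assumption made here for illustration and is not confirmed by the interview. The sequences are short made-up toy strings, not real AAV serotypes.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_pairwise_distance(seqs):
    """Average edit distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(edit_distance(a, b) for a, b in pairs) / len(pairs)


# Toy sequences for illustration only.
toy_family = ["ACDEF", "ACDEY", "ACWDEF"]
print(mean_pairwise_distance(toy_family))
```

Applied to the same region across real serotypes, a calculation of this kind would give the sort of figure quoted above; a designed variant carrying more edits than that average sits outside the range natural sequences span.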

GEN Edge: Is there anything in your research that gives any insight as to immune response to capsids differing by portion of the body? Is the eye more resistant than the liver, for example?

Sinai: We do expect differences to happen between different regions of the body. What we are focused on at Dyno is solving the problem of systemic delivery. That’s a barrier that we have to overcome, regardless of what goes down in particular downstream tissues that we target. And that is still a challenge.

Overcoming that barrier is the first step toward this goal that we will need to work on. However, based on measuring transduction in vivo, we have a lot of promising indications that there is a relationship between what the protein variant looks like and how well it can end up in any particular tissue.

GEN Edge: How does Dyno plan to apply its discovery? Would this be for custom designed capsids?

Sinai: This work, along with the work that was presented in the earlier Science paper, was basically a set of prototypes and proofs of concept that we started at Harvard. We knew these were promising directions; that’s one of the reasons we were confident that Dyno was a good proposition. We definitely have evolved and improved based on the knowledge we gained from these experiments.

We build much more complicated models now that can expand our ability to design viruses for multiple properties, such as transduction and immune evasion, and for several of these at once. And we are looking to analyze and design larger libraries and see how well they do in the full in vivo cycle. That’s what Dyno is working on now, and we are definitely expanding on this. This work is, basically, an early demonstration of what we can do. It’s not the full thing that we are currently doing.

GEN Edge: Last year, Dyno announced collaborations with Novartis and Sarepta Therapeutics, as well as with Roche and Spark Therapeutics. Any updates on those partnerships since the announcements?

Sinai: We are very happy with our partnerships. We are making progress, but I don’t have anything that we can share publicly about those at this time.

GEN Edge: The corporate partnerships involve applications of CapsidMap against various diseases. Given the variety of areas being studied with your partners, is it fair to view CapsidMap as disease agnostic?

Sinai: We definitely consider the specific needs of diseases that we are at least indirectly targeting, so it is not completely disease agnostic.

It would be more disease agnostic than methods that are specific for any particular disease. We are more general than that. But definitely, our methods and our philosophy are aligned with thinking about specific needs of patients. So, I wouldn’t characterize it as disease agnostic.

At some point, we will consider the constraints that a specific disease requires. For instance, if the disease requires a payload that has a different size than other diseases in the same tissue, we are able to engineer for that.

GEN Edge: What sort of growth does Dyno anticipate this year?

Sinai: We are about 45 people full-time, and we have doubled about every six months since inception. We are extremely lucky to have grown well in this challenging time. The company is really growing at a pace that is exciting, and many opportunities are arising. We are looking forward to continuing to grow.
