With a vision of becoming the Amazon of protein design, two of Denovium’s co-founders – Imad Ajjawi and Greg Hannum – assess the ability of deep learning to create proteins from scratch.
Scientists are trained to think there are rules to how everything works. Yet artificial intelligence (AI), in particular deep learning, reveals that sometimes their assumptions might be flawed. For example, a common assumption is that every protein starts with the same amino acid, methionine. Under this rule, the only way to annotate genomes for proteins is, well, by finding the nucleotide triplets that encode methionine.
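To make the methionine rule concrete, here is a minimal Python sketch of rule-based annotation: it scans the forward strand of a DNA string in all three reading frames for an ATG (the codon for methionine) and extends each hit to the next stop codon. This illustrates the conventional assumption only and is not Denovium’s method; the function name `find_orfs`, the `min_codons` parameter, and the example sequence are hypothetical.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) index pairs of forward-strand open reading frames.

    An ORF here starts at an ATG and ends just after the first in-frame
    stop codon. `min_codons` counts codons before the stop, ATG included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Hypothetical toy sequence: one ORF (ATG AAA TGA) in the third reading frame.
print(find_orfs("CCATGAAATGA"))
```

A scanner like this can only ever report candidates that begin with ATG, which is exactly the kind of built-in assumption the article goes on to question.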
But what if you tried to start proteins without that rule, or without the other rules involved in amino-acid assembly? With end-to-end deep learning, AI will likely not only learn these rules on its own but may also find something more: rules that we haven’t encountered or that nature has discarded, opening up whole new domains of protein design and generation.
Denovium, named after de novo (i.e. starting from the beginning), is a San Diego-based company that’s using deep learning to turn everything scientists know about proteins on its head to make any protein imaginable. Denovium’s AI can be used to improve any sector where proteins are involved. Deep learning is widely applicable to medicine, where many therapeutics are based on proteins, but it also can make a dent in the agriculture and fuel industries.
The team at Denovium is challenging nature to think in a different way about proteins. Just because nature found one solution, it doesn’t necessarily mean it’s the best solution. Nature could be following a completely different set of rules, synthesizing proteins beyond our wildest dreams in a galaxy far, far away.
GEN Edge sat down with Denovium co-founders Imad Ajjawi, PhD, and Greg Hannum, PhD, to discuss the potential for deep learning in discovering and designing proteins from scratch with nature’s rules reimagined. (The interview has been lightly edited for length and clarity.)
GEN Edge: How was Denovium founded?
Ajjawi: Denovium has created a deep learning artificial intelligence capable of understanding the fundamental properties of protein function. If you rewind four years, we were hearing a lot about deep learning in the news and in our daily lives. There was much debate about whether it was hype, but we could see that it wasn’t, because of examples like self-driving cars, better image recognition, and the deepfakes that we see on social media. For us, it was a no-brainer! We thought this was going to be the next disruptive technology in our field too. But nobody was quite doing it in biotechnology yet. There were a few examples in healthcare, mostly in image recognition. We asked, “Can we apply the same deep learning neural nets to biological data?”
We decided that a logical starting point is proteins. They carry out cellular functions, and function is what we care about. Just like DNA has its alphabet of four letters, proteins have their alphabet of 20 amino acids. If you think of all the different permutations, the complexity of proteins is huge. The number is astronomical. It’s more than the number of atoms that there are in the whole universe. If you get to the non-natural amino acids, then the complexity is even bigger, and for the things that you can create, the imagination is the limit. That’s why it’s a problem that is so well suited for deep learning. You are never going to be able to test that diversity in the lab. You need to resort to computational solutions, and deep learning is that solution.
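As a rough check on the scale described here, the number of possible sequences of length L over 20 amino acids is 20^L. The Python sketch below compares that count with about 10^80, a commonly cited order-of-magnitude estimate for the number of atoms in the observable universe. The 10^80 figure, the helper names, and the example length of 100 residues are assumptions chosen for illustration, not values from the interview.

```python
import math

AMINO_ACIDS = 20
# Rough, commonly cited order of magnitude for atoms in the observable
# universe (an assumption for this comparison, not a precise value).
ATOMS_IN_UNIVERSE_LOG10 = 80


def log10_sequence_count(length, alphabet=AMINO_ACIDS):
    """Return log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


def min_length_exceeding(log10_target, alphabet=AMINO_ACIDS):
    """Return the shortest length whose sequence count exceeds 10**log10_target."""
    return math.floor(log10_target / math.log10(alphabet)) + 1


length = min_length_exceeding(ATOMS_IN_UNIVERSE_LOG10)
print(f"Sequences of length {length} already outnumber 10^{ATOMS_IN_UNIVERSE_LOG10}")
print(f"A 100-residue protein has about 10^{log10_sequence_count(100):.1f} possible sequences")
```

Under these assumptions the crossover comes at a few dozen residues, far shorter than a typical protein, which is why exhaustive testing in the lab is not an option.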
The deep learning model was built to interpret protein function, and it worked even better than we could have imagined. The first time we knew that we were onto something super exciting was when we realized that the deep learning model was capable of interpreting proteins with previously unknown function. Roughly a third of the human genome is non-functional. The AI was capable of predicting a function for some of those proteins. That was when we knew that we were sitting on something big, and that’s really what gave us the impetus to start Denovium.
Hannum: The AI was based on this functional understanding model of proteins. What we knew is that this is just a start; it’s a drop in the bucket of what this AI is capable of doing. So you can annotate proteins, but what else? Can you modify or generate proteins? Can you even go so far as to create proteins to your specifications from scratch? Here at Denovium, we talk about, “Hey Alexa, could you build a protein that is stable and not immunogenic?” It sounds like sci-fi, but that’s exactly what we’re doing and where we think the future is. It won’t be long before this is a solved problem.
Although our focus currently is on proteins, the same power can extend to other biological data as well. Now we have a mathematical model that can read any piece of DNA from any domain of life, find protein-coding regions as well as non-coding regions, and functionally understand what those do in a single representation. This would normally require bioinformaticians to run hundreds of thousands of different little mini models and pick out their favorites. Our capability allows us to annotate metagenomes over a weekend at home, work that would normally take months and hundreds of thousands of dollars.
GEN Edge: How large is Denovium in terms of manpower?
Ajjawi: We’re a company of six. That’s one of the advantages of deep learning. The whole premise is that this is going to save you time and costs. It’s going to be more predictive than human beings are. It’s going to be faster and, in the end, cheaper. That is the value proposition that we bring to the companies that we work with.
GEN Edge: What are your backgrounds? What made you want to take this on?
Ajjawi: Something unique about the team of founders here is that everybody has a science background. We were founded by synthetic biology, genomics, and computational biology experts. Everybody is really in love with the tech and understands how it’s going to be the next disruptive technology in our field. Think of how next-gen sequencing and CRISPR-Cas have disrupted biotech. The next thing, and we’re seeing it now, is deep learning. There’s that fundamental excitement and belief that this is the future.
GEN Edge: How do you choose which scientific avenues or challenges to tackle?
Ajjawi: We’re playing across different spaces in the life sciences. We’ve partnered with large pharma, biotech, and animal health. The direction we want to focus on initially is protein therapeutics. Aside from the value that it brings society, where we can impact people’s lives positively, it’s also a very appealing market. About eight out of the ten best-selling drugs right now are protein drugs. Most of these are monoclonal antibodies, but there are many different classes of protein drugs. Even within antibodies, there’s a lot of variety. There are nanobodies, antibiotics, hormones like insulin, and protein vaccines. The market is trending in that direction. For us, that’s a very appealing opportunity that we’re focusing on right now.
GEN Edge: Does Denovium pick which problems to solve or do they come to you?
Ajjawi: A little bit of both. Although our business model is to partner with pharma companies, large biotech companies, and so on, and is driven by the market and where their needs are, we are also proactive in developing technology. You have to pick some targets to work on.
One target that would be a good example to bring up is in the COVID space. Some of the therapeutics that have been developed to treat COVID are antibodies or protein decoys. There have been many publications about how you could modify the binding of the coronavirus spike protein to the human ACE2 receptor. You can create protein decoys by modifying the ACE2 receptor, and that way the coronavirus is prevented from infecting the individual. Many published studies have used site-directed mutagenesis to introduce different amino-acid changes into ACE2 and come up with better binders.
We took that data and said, “How does that data look with our deep learning models? Can we predict a higher binding affinity?” We were super excited to see that our deep learning is capable of predicting the right mutations in the right direction. We were able to tell which mutations were going to be higher affinity binders and which ones would be detrimental. That’s a real-life example that wasn’t driven by anybody coming to us. It was just something where there was a lot of public data available out there, and we decided to test this out and try it for ourselves.
GEN Edge: What is your limiting factor? Manpower? Funding?
Ajjawi: All of the above! Right now we’re raising a round of funding. We’re looking for partnerships with key clients as well. Strategic partnerships, especially ones where we can create an asset together, are definitely on the roadmap.
Then there are the two components of our business – the functional discovery and design of proteins. We envision building this AI that completely understands the fundamental properties of protein function. The tech is there; we even have some models that are already producing exciting protein sequences. But it’s going to take some people to test these out and prove it. We’re a technology company that creates these assets. In the end, an asset is essentially a protein sequence with the exact attributes that you would like it to have. We license this back to clients, and they can go ahead and contract a manufacturer for scale-up.
GEN Edge: What are the major challenges getting in the way of Denovium’s vision?
Ajjawi: The major challenge is getting data that you can trust. The old saying “garbage in, garbage out” applies here too. That’s where our experience in curating large databases and mining genomes comes in so handy. We have spent a lot of time manually curating these databases and refining the way that we get our data. We use a combination of public data and private data coming from our partners. We take public data and then restructure it so that it can be fed into our Denovium artificial intelligence. We then consider it a proprietary data set. Partners provide their own data so we can help answer their specific problems.
Right now, one of our bottlenecks, and part of the reason why we’re fundraising, is that we lack wet-lab capabilities and want to add them. This will allow us to test our predictions a lot faster. That has been a bottleneck for us so far, because we’re a hundred percent dependent on the partners for it. We’d like to add our own wet-lab capabilities so that we can produce some of those proprietary data sets as well. That would set us apart from the competition.
GEN Edge: What is your ideal trajectory?
Hannum: From a technical standpoint, the short term is going to be about expanding the scope of the AI. This way it can understand more properties that are of interest to our clients. In the longer term, we’re interested in increasing the AI’s cognitive capabilities so that it can make deductions faster, with fewer cycles back in the laboratory. This means going beyond understanding a protein well enough to annotate it, or even to modify it and understand what the modification does, which is something the AI is already pretty good at, and reversing the problem into, “I want X, Y, and Z; give me the protein sequence.” We have prototypes of that.
On the business front and in terms of vision, we like to use the analogy of Alexa. We see ourselves as being the Amazon of protein design. We have a large goal in front of us, and we firmly believe that we can accomplish it. Again, in the short term, the focus is going to be on drug discovery and drug development, specifically protein therapeutics, and next we’ll apply that playbook to other areas of the life sciences, like food and agriculture. We’re excited to have people come and collaborate with us. Let’s change the world one day, one discovery at a time!