A central hurdle in computational protein design is the mismatch between proteins designed in silico and their actual behavior after synthesis. “You can design millions of proteins on a computer over the course of a week or a month, but computational approaches are just not good enough,” explained David Younger, co-founder and CEO of A-Alpha Bio, in a recent interview. That is, after synthesis, the proteins often do not behave as desired.
Furthermore, the throughput for manual protein experiments is typically low. “You might be able to design millions of proteins, but you might only be able to test 10,” Younger said. The company thus built a platform designed to mitigate this bottleneck by focusing on protein binding — a crucial aspect of protein function.
Established in 2017 as a spinoff from the Institute for Protein Design at the University of Washington, the company recently raised $22.4 million in funding in an oversubscribed A2 funding round. “The funding is very much both for building the platform — both wetlab and computational — and progressing the pipeline,” said Younger. The company has raised $51 million to date.
Tackling the protein design challenge
To tackle the challenge of accurately predicting in-vivo protein behavior from in-silico designs, A-Alpha Bio has developed a synthetic biology platform known as AlphaSeq. The platform characterizes entire networks of protein-protein interactions through a “library on library” approach. “We can take not just a single antibody but a library of antibodies, and we can take not just a single antigen but a library of antigens,” Younger said. “And then we can cross those two libraries and we can measure all of the pairwise interactions between those two libraries quantitatively.”
The platform also gauges relevant binding strengths spanning picomolar to micromolar affinities. By screening networks of interactions, AlphaSeq provides biophysical binding data relevant to downstream therapeutic applications.
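The “library on library” idea can be illustrated with a minimal Python sketch. It is not A-Alpha Bio’s implementation: the library member names, the `cross_libraries` and `affinity_band` helpers, and the band cutoffs are all hypothetical, chosen only to show how crossing two libraries enumerates every pair and how a dissociation constant might be bucketed across the picomolar-to-micromolar range.

```python
from itertools import product

# Hypothetical library members; placeholders, not real sequences.
antibodies = ["Ab1", "Ab2", "Ab3"]
antigens = ["Ag1", "Ag2"]


def cross_libraries(lib_a, lib_b):
    """Return every (a, b) pair between two libraries."""
    return list(product(lib_a, lib_b))


def affinity_band(kd_molar):
    """Label a dissociation constant (mol/L) by order of magnitude.

    The cutoffs are illustrative assumptions, not values from the platform.
    """
    if kd_molar < 1e-9:
        return "picomolar or tighter"
    if kd_molar < 1e-6:
        return "nanomolar"
    if kd_molar < 1e-3:
        return "micromolar"
    return "weaker than micromolar"


pairs = cross_libraries(antibodies, antigens)
print(len(pairs))  # 3 antibodies x 2 antigens = 6 pairwise interactions
```

The number of pairs grows as the product of the two library sizes, so two libraries of 1,000 members each would yield 1,000,000 pairwise interactions to measure.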
Alongside AlphaSeq, A-Alpha Bio has developed a complementary platform known as AlphaBind that applies machine learning (ML) to AlphaSeq data to optimize the protein design process. Using more than 300 million AlphaSeq affinity measurements, AlphaBind predicts the binding strength of protein sequences.
A glimpse into A-Alpha Bio’s therapeutic pipeline
A-Alpha Bio’s pipeline is focused on two areas: molecular glues and biologics, particularly immunocytokines. The company’s cis-signaling immunocytokines consist of two protein components: a cytokine and an antibody. The antibody guides the therapy to a specific cell population by binding to a unique cell surface marker. This approach allows for targeted delivery of therapeutic molecules.
For molecular glues, the firm is pursuing target identification. For biologics, it aims to develop novel drugs by modifying cytokines and fusing them to targeting antibodies. For example, A-Alpha Bio is developing “detuned” cytokines by systematically mutating them to reduce potency before pairing them with antibodies. This immunocytokine approach is intended to optimize safety and efficacy.
Building a large protein-protein interaction database
A-Alpha Bio has created an extensive protein-protein interaction database. “We now have a protein-protein interaction database that has about 440 million measured protein-protein interactions, so a huge amount of data around how proteins interact,” Younger shared.
This database feeds into A-Alpha Bio’s machine learning models, predicting optimal protein sequences. “Instead of making each protein individually, we build libraries of thousands of proteins transformed into huge populations,” explained Randolph Lopez, CTO of A-Alpha Bio. The company then uses the degree of interaction between proteins as a proxy for utility. “Then we employ next-gen sequencing for analysis,” Lopez said.
The data sets used for data processing and ML training are “quite big,” Lopez acknowledged. As with any company dealing with very large data sets, A-Alpha Bio faces substantial costs for training models and for cloud-based infrastructure.
Plans to maintain a competitive edge in synthetic biology
To maintain its edge in the rapidly evolving synthetic biology space, A-Alpha Bio plans to lean into its platform, talent and strategic partnerships. “Since the founding of the company, we’ve had a significant platform advantage,” Younger said. “Our models are so sophisticated that they can predict the desired sequences at a much faster pace.”
The company’s talent base would be “very difficult to replicate,” Younger said. The company’s employees have diverse backgrounds in fields like bioinformatics, protein biochemistry, cellular biology and machine learning.
In addition, A-Alpha Bio plans to use the funding to expand its internal pipeline, especially in the oncology field, while continuing partnerships with prominent Big Pharma companies. Recent collaborations include Bristol Myers Squibb, Gilead and Lawrence Livermore National Laboratory (LLNL). The company also has “a couple of top-top tier undisclosed pharma partnerships,” Younger said.
“The way we initially and still maintain a moat is both in terms of the generation capabilities in protein-protein interactions and now the database we’ve built — the AlphaSeq and the data we’ve generated,” Lopez explained. The company has aimed to be strategic in terms of its partnerships. “LLNL was very interesting to us in particular because of their approach for using structure to effectively use data that we can generate and their structure predictions to advance antibody therapeutic development in their case,” Lopez said. “And the other reason it was interesting is their project is very focused on pandemic preparedness.”
A-Alpha Bio prioritizes proactive data generation
A-Alpha Bio is confident that its approach could shed light on infectious disease, including SARS-CoV-2. “If we have a dataset that consists of thousands or millions of different antibodies binding to thousands or millions of different coronavirus variants, we can start to map the landscape,” Younger said. “If a new coronavirus variant crops up, even if it’s one that we’ve never seen before, we can take all of the data that we’ve generated historically and predict a new antibody sequence that’s likely to be an effective drug against that never-before-seen virus.”
In the interim, the company’s platforms, AlphaSeq and AlphaBind, continue to evolve, with the firm benchmarking different machine learning approaches every three months to refine its models and enhance predictive power over time.
“Our long term ambition is for this to become a more and more generalizable model, able to predict binding for entirely novel proteins unrelated to previous data,” said Younger. “It’s not just predicting binders for new coronavirus strains, but for any new protein out there. We’re not there yet, but with the continual data generation enabled by our platform, we believe we have a solid shot.”