Chain of amino acid or bio molecules called protein - 3d illustration

[Christoph Burgstedt/Adobe Stock]

At the Nvidia GTC 2023, DeepMind’s Founder and CEO, Demis Hassabis, provided an in-depth look into the seismic potential of their protein folding AI system, AlphaFold. Hassabis said AlphaFold was a contender for the organization’s “biggest project to date.”

Hassabis noted that DeepMind’s AlphaFold has made strides in addressing the protein folding problem, a challenge that has stumped scientists for more than five decades. The protein folding problem involves predicting the 3D structure of proteins solely from the amino acid sequence. Until AlphaFold, determining a protein’s structure was a complex and time-consuming process, often taking years of experimental work.

At the end of 2022, the company announced that it updated AlphaFold with the structure predictions of more than 200 million proteins.

Hassabis explained the enormity of the challenge, stating, “There are around 10300 possible confirmations [shapes or structures] that an average protein can take.” He pointed out, however, that “proteins in our body spontaneously fold extremely quickly, in a matter of milliseconds.” This paradox indicated that the problem should be solvable. To measure progress in this field, a biennial competition called CASP (Critical Assessment of protein Structure Prediction) has sought to test the protein folding predictions of various academic and industrial labs. The bi-annual competition has been running every two years since 1994.

The significance of AlphaFold

AlphaFold has demonstrated significant progress in tackling the protein-folding problem. Hassabis shared, “Prior to us entering the field with AlphaFold 1 in 2018, and then AlphaFold 2 in 2020, if we look at the decade of progress before that, from 2006 to 2016, on the previous CASP editions, you can see that essentially there’d been no progress for pretty much a decade.”

AlphaFold, a system capable of predicting protein structures with remarkable accuracy, could have far-reaching implications for the future of drug discovery and development. Hassabis believes that this groundbreaking technology will usher in a new era of digital biology, saying, “AlphaFold, I think, is that proof of concept.” He added that the system promises to herald “the dawn of a new era of what we like to refer to as ‘digital biology.’”

AlphaFold is not the only system from a tech company focused on proteins. Scientists at Facebook’s parent Meta have also developed an AI language model known as ESMFold to predict the unknown structures of more than 600 million proteins pertaining to viruses, bacteria and other microbes. The researchers repurposed the model, first designed for decoding human languages, to make accurate predictions regarding proteins’ 3D structure.

As for DeepMind, it is also working on a number of other projects in chemistry and biology to expedite the drug discovery process. Hassabis envisions the development of a “virtual cell” that models all cellular dynamics and can be used to perform in silico experiments. This would streamline the research process, requiring wet lab validation only at the final stage.

The broader impact of AI in research

In addition to AlphaFold, AI systems like ESMFold could make significant contributions to various scientific disciplines and industrial applications, as demonstrated by DeepMind’s work in quantum chemistry, fusion reactor control, rainfall prediction and data center cooling efficiency. The company has also developed innovative technologies such as WaveNet, a realistic text-to-speech system, and AlphaCode, which competes in programming competitions.

The introduction of ESMFold brings competition to the field, which could prompt Meta and DeepMind to push the boundaries of their AI models to improve accuracy and efficiency.

The two companies could both offer useful approaches. That is, ESMFold’s speed advantage could complement AlphaFold’s higher accuracy. Researchers could potentially use ESMFold for initial predictions or for large-scale projects, and then refine the results using AlphaFold for specific proteins of interest. This combination could optimize the research process and maximize the benefits of both models.

Meta’s involvement in the field could also drive more of an open-source approach. For instance, the compilation of ESMFold’s predictions into the open-source ESM Metagenomic Atlas encourages collaboration and sharing of data among researchers. This could pave the way for similar initiatives by DeepMind or other organizations, fostering a more collaborative environment in protein structure prediction.

During his talk at Nvidia GTC 2023, Hassabis also underscored the importance of safety and ethical considerations in AI research, particularly as AI systems become more powerful. DeepMind’s large language model, Sparrow, exemplifies this commitment, designed to be helpful, correct, and harmless while prioritizing safety and rule adherence.

Ultimately, breakthroughs like AlphaFold and ESMFold have the potential to catalyze progress drug discovery and other scientific fields, potentially helping usher in a new era in digital biology. Hassabis’ presentation showcased DeepMind’s ambitious vision, while stressing the need for safety, responsibility and ethics in AI development. As the company continues to push the boundaries of AI and tackle complex challenges related to protein characterization, collaboration and innovation will be key to unlocking the full potential of these groundbreaking technologies.