While the topic of AI in drug discovery has received considerable attention in recent years, mature deployments of techniques such as machine learning in the industry remain rare.

“The chemistry domain is qualitatively different from any other problem that machine learning has exhibited real success in,” said Jason Rolfe, CTO of Variational AI (Vancouver). 

For one thing, there is a relatively limited number of FDA-approved drugs. As of 2018, FDA said it had approved 19,000 prescription drugs.

A dataset involving FDA-approved drugs that have been tested in humans would be orders of magnitude smaller than the sort of datasets that underlie Generative Pre-trained Transformer 3 (GPT-3), a language model from OpenAI, an AI research company co-founded by Elon Musk.   

Jason Rolfe

High-throughput screening can generate substantially larger datasets. One example is the PubChem database, which NIH bills as the “largest collection of freely accessible chemical information.” The data there, however, can be noisy. Many of the apparent active compounds likely won’t be validated in a secondary screen due to factors such as aggregation, sample contamination and assay interference.

“These datasets are intrinsically more difficult to work with than something like ImageNet, which has been the workhorse for much of the architectural development in machine learning,” Rolfe said. 

In ImageNet, a visual database with more than 14 million annotated images, the data are relatively clean. “Some images are misclassified, but it’ll be something like a Shih Tzu classified as a Pomeranian,” Rolfe said.

By contrast, noise is “rampant in pharmacological data,” Rolfe said. 

Drug discovery is “a very challenging domain to work in, but it has outsized promise,” Rolfe said. 

With the cost of developing a new drug frequently hitting multiple billions of dollars and the failure rate high, “anything that can reduce that by even a fraction would be of extreme value to society,” Rolfe noted.

Handol Kim

Variational AI focuses on machine learning for drug discovery to generate small molecules that become assets licensed to biopharma companies. 

The biopharma industry is in the “first or second inning” in terms of adopting techniques such as machine learning for drug discovery, said Handol Kim, CEO of Variational AI. 

During the pandemic, skepticism about AI’s promise in drug discovery has begun to fade, Kim said. “Pharma companies are realizing this could be a new potential modality not unlike biotech in the 1970s or 1980s,” he added.

In addition, “a lot of pharma companies are now investing in hiring people to specifically work on AI for drug discovery companies,” Kim said.