Off with the training wheels: AI-based patient characterization can improve clinical trial performance without large data sets


Only 12% of new drug candidates that enter phase 1 clinical development ultimately receive FDA approval. This dismal success rate leaves millions of patients with unmet medical needs and drives up the costs for the small number of drugs that make it to market. More frustratingly, it leaves untold numbers of potentially transformative therapies back-burnered or discarded entirely, not because they don’t actually provide benefit, but because they were tested in trials that weren’t effectively designed to demonstrate benefit. The true failure hasn’t been in drug innovation but in identifying the patient traits that govern clinical trial outcomes.

The big challenges of big data methodologies

Artificial intelligence (AI) holds great promise for improving this success rate: data-driven approaches can identify the traits, and combinations of traits, that allow patient populations to be enriched more effectively for clinical trials. However, most clinical trial AI solutions are data hungry and must therefore be trained on massive historical clinical trial datasets (tens of thousands of samples) related to a particular disease indication, which creates problems of its own.

These training samples come from patient populations that miss the nuanced but significant differences that arise from the specifics of a given clinical trial, and they rely on variables that may not even be collected in already overburdened clinical trials. Further, patient populations change over time with respect to diversity, attitudes, epigenetics, and microbiome alterations, all of which can affect drug response. For these reasons, models based on large historical data sets need to be complemented by methods designed specifically to deal with the much smaller data sets derived directly from the preceding phase 2/3 trial.

Focus mechanisms provide clearer and more meaningful insights

To learn as much as possible from the trials directly preceding a pivotal trial, we need to be able to learn from smaller data sets. This requires a different way of learning: AI that isn’t as dependent on training wheels (i.e., training data) and that can identify with high precision the patient traits (and combinations thereof) that drive clinical trial outcomes. A key to this more intelligent approach to AI-based clinical solutions is the ability of the algorithm to do something that is quite difficult for humans: acknowledge when it doesn’t understand something.

In large data sets with many thousands of patient records, it is likely that most of the variability of the disease is captured, so every patient record can be considered equally informative. However, this is not true for the small patient data sets found in clinical trials. Thus, the system must be more irreverent with respect to the labels provided by the physicians running the trial, labels such as responder or non-responder. The algorithms used in this setting must be able to fracture the data set into explainable and unexplainable subsets through the use of what we call ‘focus mechanisms,’ so named because they allow the AI to focus on the parts of the data that it is confident can influence a trial.

Setting aside the unexplainable subsets reduces the noise that drowns out important signals of efficacy, toxicity, and placebo response in the explainable part of the data set. This in turn enables the identification of the key variables that drive these responses, and improves the training of models that are capable of admitting when they just don’t know how to classify a patient.
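The general idea can be illustrated with a toy sketch. This is not the method used by any particular product; it swaps in a simple nearest-neighbour agreement rule as a stand-in for a focus mechanism. The function names, the two-feature patient records, and the parameters `k` and `min_agreement` are all hypothetical choices made for illustration. A record is treated as explainable when most of its nearest neighbours share its label, and the resulting model returns `None` (abstains) when the neighbours of a new patient disagree too much.

```python
from collections import Counter
from math import dist


def nearest_labels(point, records, labels, k, skip=None):
    """Labels of the k records closest to `point`, optionally skipping one index."""
    order = sorted(
        (i for i in range(len(records)) if i != skip),
        key=lambda i: dist(point, records[i]),
    )
    return [labels[i] for i in order[:k]]


def split_explainable(records, labels, k=4, min_agreement=0.75):
    """Partition record indices by how well each label agrees with its neighbours."""
    explainable, unexplainable = [], []
    for i, (record, label) in enumerate(zip(records, labels)):
        neighbours = nearest_labels(record, records, labels, k, skip=i)
        agreement = neighbours.count(label) / len(neighbours)
        (explainable if agreement >= min_agreement else unexplainable).append(i)
    return explainable, unexplainable


def predict_or_abstain(point, records, labels, k=4, min_agreement=0.75):
    """Return the majority neighbour label, or None when the vote is too split."""
    neighbours = nearest_labels(point, records, labels, k)
    label, votes = Counter(neighbours).most_common(1)[0]
    return label if votes / len(neighbours) >= min_agreement else None


# Hypothetical toy data: two features per patient, invented for illustration.
records = [
    (0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0),  # responder group
    (4.0, 0.0), (4.0, 1.0), (4.0, 2.0), (4.0, 3.0),  # non-responder group
    (0.0, 1.5),  # labelled non-responder but sits among responders
]
labels = ["responder"] * 4 + ["non-responder"] * 5

explainable, unexplainable = split_explainable(records, labels)
clean_records = [records[i] for i in explainable]
clean_labels = [labels[i] for i in explainable]

print(unexplainable)                                                   # [8]
print(predict_or_abstain((0.5, 1.0), clean_records, clean_labels))     # responder
print(predict_or_abstain((2.0, 1.5), clean_records, clean_labels))     # None
```

In this sketch the one record whose label contradicts its surroundings is set aside rather than forced into the model, and a patient who falls midway between the two groups gets no prediction at all. A real system would need a far more careful definition of what counts as explainable than this distance-based vote.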

While the use of focus mechanisms has the potential to improve clinical trial outcomes generally, it becomes especially beneficial given the growing interest in, and regulatory requirements for, greater diversity among clinical trial participants. While diverse cohorts are essential for enabling the development of therapies that can be used safely and effectively in many patients, they are more complex to analyze. As genetic and demographic diversity increases, the ability to focus on explainable subsets of complex but small patient data sets becomes even more critical.

ALS case study

Research published in Frontiers in Computational Neuroscience in January 2024 demonstrates how a novel clinical AI solution (NetraAI, NetraMark Holdings, Inc.) discovered new amyotrophic lateral sclerosis (ALS) drug targets and unique ALS patient subpopulations that could substantially improve clinical trial success rates in an indication with significant unmet therapeutic need. In this study, the solution was used to analyze data collected by Answer ALS (the largest collaborative effort in ALS, which brings together multiple research organizations and key opinion leaders) from more than 800 ALS patients and 100 healthy controls at eight neuromuscular clinics distributed across the U.S.

In this study, the clinical AI solution replicated ALS drug targets identified by studies that used large data AI methods, but also identified wholly novel targets. These targets, which can be grouped into classes relevant to ALS, may enhance our understanding of ALS pathophysiology and may enable new approaches to treating ALS. The clinical AI solution also identified specific subsets among 116 ALS patients defined by specific gene expression patterns, potentially enabling new approaches to personalized medicine. Moreover, the identification of these subpopulations may significantly improve clinical trial outcomes by aligning therapeutic and disease mechanisms of action. Importantly, identifying these subsets within a 116-patient cohort simply is not feasible using big data AI approaches.

From clinical trial to clinical verdict

Joseph Geraci

Current clinical development paradigms take a trial-and-error approach. The error component of this approach has significant financial, time, and human costs. Patients are essential to the clinical trial process, and new data-driven approaches are needed to increase the benefits and reduce the risks to which clinical trial participants are subject. Increasing the likelihood of clinical trial success may also encourage more patients to participate in clinical trials. This is important for improving health outcomes for patients in trials and those who may benefit once the tested therapy is approved. Novel AI-based technologies that can significantly improve clinical trial success are essential for moving from trial to verdict as quickly, safely, and cost-effectively as possible. The journey goes faster when the training wheels are off.

Joseph Geraci, Ph.D., is the CSO/CTO and co-founder of NetraMark, where he merges his expertise in advanced mathematics, physics, oncology, neuroscience, molecular medicine and quantum machine learning. He holds postdocs in machine learning, oncology, and neuropsychiatry. Geraci is associated with the Department of Molecular Medicine and Pathology at Queen’s University in Ontario, Canada, and the Centre for Biotechnology and Genomics Medicine, Medical College of Georgia, USA.