Beyond Legacy Tools: Defining Modern AI Drug Discovery for 2025 and Beyond
In this report:
- Intro: The New Framework
- “AI Drug Discovery” is About Holism
- "AI Drug Discovery" is About Building Software
- Access to Data is King
- Validation is Critical for AI Drug Discovery Platforms
- Grand Vision of AI Drug Discovery for 2025 and Beyond
Disclaimer
This report aims to provide an educational, balanced, and pragmatic perspective on AI-driven drug discovery (AIDD). No part of this report should be construed as promotional content or marketing communication.
Some companies featured are past or current clients, and certain organizations provided factual input during the research process. All analysis and conclusions were developed independently to ensure objectivity.
This report does not constitute investment advice or an endorsement. While we strive for accuracy and neutrality, we accept no liability for decisions made based on this content. Readers are encouraged to conduct their own due diligence.
As of 2025, there still appears to be no robust definition of the emerging category of artificial intelligence-driven drug discovery companies (hereinafter, AIDD).
The purpose of this report is to suggest a qualitative framework for classification of AIDD companies, combining the four key attributes that define the leading players in this area:
- Focus on holism vs reductionism in biology
- Creating robust AI platforms (software)
- Priority of data acquisition
- Technology validation (via demonstrable ability to discover novel targets, discover and develop clinical-grade drug candidates rapidly, a track record of platform partnerships, scientific publications, patents, and so on)
We will delve deeper into framework discussion below, but in a nutshell, it boils down to this:

Indeed, setting aside the specifics of tech stack and platform design, the business value of an AI platform comes down to three key questions:
(1) Is the computational platform scalable and robust enough to affect the R&D workflow, collaboration patterns, and daily decision-making of a wide range of specialists in a given organization, and so make a difference in productivity?
(2) Can it represent biology in silico with sufficient depth and breadth to capture relevant and useful dependencies, patterns, and network biology effects, and so influence scientific decision-making beyond mainstream research workflows?
(3) Can the AI platform address the two questions above in a repeatable, stable, standardized way across all levels of R&D workflows in the organization? Would a third-party collaborator get sustainable value from the AI software if they had access?
In our opinion, AIDD is about being able to answer “yes” to all three questions. This is what makes the AIDD platform a tangible business asset.
“AI Drug Discovery” is About Holism
As we explore the newly suggested framework, one key distinction emerges: the difference in what we attempt to model and represent computationally in today’s AI-driven landscape versus what was typically addressed using earlier generations of computational tools.
A helpful starting point is to consider the conceptual gap between traditional software—developed decades ago and still widely used in drug discovery for specific tasks—and modern AI-enabled platforms that are increasingly positioned as end-to-end solutions. While both types of tools play valuable roles, their underlying philosophies differ significantly.
In simple terms, “traditional” or “legacy” cheminformatics and bioinformatics rely on human-driven approaches: cheminformatics uses predefined chemical descriptors (like molecular weight or logP), statistical methods and some machine learning approaches for tasks like QSAR modeling and docking, while bioinformatics applies statistical methods, including dimensionality reduction techniques, to analyze complex biological datasets (e.g., genomics, proteomics) and uncover potential drug targets. These methods are hypothesis-driven, modular, and work with smaller, well-structured datasets.
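To make the descriptor-driven approach concrete, here is a minimal sketch of a linear QSAR fit on two predefined descriptors. The descriptor values, the hidden linear relationship, and the function names (`fit_linear_qsar`, `predict_activity`) are all hypothetical and chosen purely for illustration; they do not come from any real compound series or library.

```python
# Minimal QSAR sketch: ordinary least squares on hand-picked descriptors.
# All numbers are synthetic and serve only to illustrate the workflow.

def fit_linear_qsar(descriptors, activities):
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination."""
    rows = [[1.0] + list(d) for d in descriptors]  # prepend an intercept term
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, activities))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][n] for i in range(n)]  # [intercept, w_mw, w_logp]


def predict_activity(weights, descriptor):
    """Apply the fitted linear model to one descriptor vector."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], descriptor))


# Hypothetical compounds: (molecular weight / 100, logP).
toy_descriptors = [(1.8, 1.2), (2.5, 2.0), (3.1, 0.5), (4.0, 3.3), (2.2, 2.9)]
# Synthetic activities generated from an assumed linear relationship.
toy_activities = [1.0 + 0.5 * mw + 2.0 * logp for mw, logp in toy_descriptors]

qsar_weights = fit_linear_qsar(toy_descriptors, toy_activities)
```

The point of the sketch is the division of labor: a human chooses the descriptors and the model form up front, and the statistics only estimate a handful of coefficients.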
Conceptually, legacy computational systems and simpler machine learning methods are useful in the paradigm of “biological reductionism.” And they do a great job there, even today.
A classic example of the reductionist approach is structure-based drug discovery, where modulating a specific protein is believed to be the answer to a drug discovery problem (it sometimes is). The computational part, therefore, focuses mostly on narrow-scope tasks such as fitting a ligand into a protein pocket (docking) or computationally identifying a new type of chemistry for a given target (ligand-based virtual screening).

In stark contrast, cutting-edge AI-driven drug discovery companies attempt to shift to the systems biology level and a hypothesis-agnostic approach, using deep learning-based systems to integrate largely multimodal data (phenotype, omics, patient data, chemical structures, texts, images, etc.) into complex and comprehensive representations of biology (e.g., “knowledge graphs”).

For example, the scientific underpinnings of Pharma.AI computational platform by Boston-based Insilico Medicine are rooted in a novel combination of policy-gradient-based reinforcement learning (RL) and generative models, enabling multi-objective optimization to balance parameters such as potency, toxicity, and novelty.
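As a rough illustration of how policy-gradient reinforcement learning can balance several objectives, the sketch below collapses potency, toxicity, and novelty into one scalar reward and applies exact policy-gradient updates to a softmax policy over three candidates. The candidate names, property scores, weights, and learning rate are all made up, and this is not a description of how Pharma.AI is implemented.

```python
import math

# Hypothetical candidates with invented property scores in [0, 1].
CANDIDATES = {
    "mol_a": {"potency": 0.9, "toxicity": 0.8, "novelty": 0.2},
    "mol_b": {"potency": 0.7, "toxicity": 0.1, "novelty": 0.6},
    "mol_c": {"potency": 0.3, "toxicity": 0.1, "novelty": 0.9},
}


def shaped_reward(props, weights=(0.5, 0.3, 0.2)):
    """Weighted-sum scalarization; toxicity is penalized via (1 - toxicity)."""
    w_pot, w_tox, w_nov = weights
    return (w_pot * props["potency"]
            + w_tox * (1.0 - props["toxicity"])
            + w_nov * props["novelty"])


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, rewards, lr=1.0):
    """One ascent step along the exact gradient of expected reward."""
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))
    # d E[r] / d logit_i = p_i * (r_i - E[r]) for a softmax policy.
    return [l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, rewards)]


names = list(CANDIDATES)
rewards = [shaped_reward(CANDIDATES[name]) for name in names]
logits = [0.0] * len(names)  # start from a uniform policy
for _ in range(200):
    logits = policy_gradient_step(logits, rewards)
final_probs = dict(zip(names, softmax(logits)))
```

In this toy setup the policy shifts probability toward the candidate with the best overall trade-off rather than the most potent one, which is the practical meaning of multi-objective optimization here. Real generative chemistry systems sample from a far larger action space and estimate the gradient from samples.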
According to the company, the PandaOmics target identification module leverages 1.9 trillion data points from over 10 million biological samples (including RNA sequencing and proteomics) and 40 million documents (such as patents and clinical trials), using NLP and machine learning to uncover and prioritize novel therapeutic targets.
The Chemistry42 module applies deep learning, including generative adversarial networks (GANs) and reinforcement learning, to design novel drug-like molecules optimized for binding affinity, metabolic stability, and bioavailability.
In the context of clinical development, inClinico predicts trial outcomes using historical and ongoing trial data, offering insights into patient selection and endpoint optimization.
On the algorithmic side, Pharma.AI incorporates advanced reward shaping, allowing it to fine-tune generated molecules to specific target profiles or polypharmacological goals. Additionally, Insilico emphasizes the use of knowledge graph embeddings, which encode biological relationships (such as gene–disease, gene–compound, and compound–target interactions) into vector spaces.
These embeddings are augmented by attention-based neural architectures, inspired by transformer models, to focus on biologically relevant subgraphs, refining hypotheses for target identification and biomarker discovery.
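To show what encoding relationships into a vector space can look like, here is a small sketch using a TransE-style score, where a triple (head, relation, tail) is considered plausible when head + relation lands close to tail. TransE is our choice for illustration only; the source does not state which embedding method Insilico uses. The entity names and vectors are hand-set and hypothetical, whereas real embeddings are learned from data.

```python
import math

# Hand-set 2-D vectors for illustration only; real embeddings are learned.
ENTITY = {
    "GENE_A": (0.0, 0.0),
    "GENE_B": (1.0, 0.0),
    "DISEASE_X": (0.0, 1.0),
    "DISEASE_Y": (1.0, 1.1),
}
RELATION = {"associated_with": (0.0, 1.0)}


def transe_score(head, relation, tail):
    """Negative distance between (head + relation) and tail; higher is better."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    translated = [hi + ri for hi, ri in zip(h, r)]
    return -math.dist(translated, t)


def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates,
                  key=lambda tail: transe_score(head, relation, tail),
                  reverse=True)


diseases = ["DISEASE_Y", "DISEASE_X"]
ranking_for_gene_a = rank_tails("GENE_A", "associated_with", diseases)
ranking_for_gene_b = rank_tails("GENE_B", "associated_with", diseases)
```

Ranking candidate tails for a given gene is the kind of query that turns an embedding into a target or indication hypothesis.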
The platform employs a continuous active learning and iterative feedback process, retraining models on new experimental data, including biochemical assays, phenotypic screens, and in vivo validations, to accelerate the design–make–test–analyze (DMTA) cycle by rapidly eliminating suboptimal candidates and enhancing lead generation.
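The feedback loop described above can be sketched as a toy active learning cycle. Everything in it is an assumption made for illustration: the `hidden_assay` function stands in for a wet-lab measurement, the surrogate model is a simple nearest-neighbor lookup, and the acquisition rule (prediction plus an uncertainty bonus) is one common choice among many. None of it reflects the platform's actual models.

```python
def hidden_assay(x):
    """Stand-in for an experimental readout; the model never sees this formula."""
    return -(x - 7) ** 2


def predict(x, labeled):
    """Nearest-neighbor surrogate returning (predicted activity, uncertainty)."""
    nearest = min(labeled, key=lambda known: abs(known - x))
    return labeled[nearest], abs(nearest - x)


def run_dmta(pool, seed_points, rounds, explore=5.0):
    """Iterate design, make/test, and analyze steps over a candidate pool."""
    labeled = {x: hidden_assay(x) for x in seed_points}
    for _ in range(rounds):
        untested = [x for x in pool if x not in labeled]
        if not untested:
            break

        def acquisition(x):
            mean, uncertainty = predict(x, labeled)
            return mean + explore * uncertainty

        # Design: pick the candidate with the best predicted value plus bonus.
        chosen = max(untested, key=acquisition)
        # Make/test/analyze: "run the assay" and feed the result back.
        labeled[chosen] = hidden_assay(chosen)
    return labeled


pool = list(range(11))  # eleven hypothetical candidates, indexed 0..10
results = run_dmta(pool, seed_points=[0, 10], rounds=4)
best_candidate = max(results, key=results.get)
```

In this toy run the loop reaches the best candidate after testing six of the eleven options, which is the basic economic argument for closing the loop between models and experiments.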
Furthermore, the platform’s multi-modal data fusion integrates textual information from published literature, patents, and clinical trial data with omics-level insights and chemical libraries. To this end, Natural Language Processing (NLP) models are used to extract relevant biological context and side-effect annotations from these textual sources, which are then enriched with phenotypic screening data, enabling a holistic view of the drug discovery process.
You can familiarize yourself with some of the aspects of the Pharma.AI platform by reading a recent paper “A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models” (image below is from the paper).

Another relevant example of what can be classified as an AI drug discovery approach is Recursion’s OS platform.
The Recursion OS is a vertical platform of diverse technologies that enables the company to map and navigate trillions of biological, chemical, and patient-centric relationships utilizing approximately 65 petabytes of proprietary data.
According to a commentary by Recursion, the OS integrates ‘Real World’ data generated in the company's own wet laboratories or by select partners with a ‘World Model’, a collection of AI computational models also built in-house. Today, its scaled ‘wet-lab’ biology, chemistry, and patient-centric experimental data feed ‘dry-lab’ computational tools that identify, validate, and translate therapeutic insights, which can then be validated in the wet lab. The Recursion OS is powered by BioHive-2, which the company claims is the fastest supercomputer wholly owned and operated by a biopharma company.
Although it differs from Insilico Medicine in model architectures and workflows, Recursion pursues the same key objective: creating a comprehensive representation of biology from which to mine crucial insights for drug discovery.

Key models of Recursion OS include Phenom-2, a 1.9 billion-parameter ViT-G/8 MAE trained on 8 billion microscopy images, achieving a 60% improvement in genetic perturbation separability, according to company claims.
MolPhenix, winner of NeurIPS 2024 Best Paper, predicts molecule-phenotype effects with a considerable improvement over baselines. MolGPS, a 3-billion-parameter model, excels in molecular property prediction and integrates proprietary phenomics data, outperforming benchmarks in 12 of 22 ADMET tasks. MolE, trained on 842 million molecular graphs, leads in 10 of 22 ADMET tasks.
An interesting component of the Recursion OS is a knowledge graph tool that evaluates promising signals found by the platform through a complex lens of topics of interest in biology and drug discovery, including global trend scores, protein pockets and structure, competitive landscape, and clinical trials. The knowledge graph allows researchers to perform “target deconvolution” (identifying and validating the molecular targets behind a small molecule's phenotypic responses) in order to narrow hundreds of possibilities down to the best target opportunity.
A more recent example comes from California-based Iambic Therapeutics, founded in 2019. The team at Iambic developed a drug discovery platform that integrates three specialized AI systems (Magnet, NeuralPLexer, and Enchant) into a unified pipeline that computationally spans molecular design, structure prediction, and clinical property inference.

Magnet generates synthetically accessible small molecules by leveraging reaction-aware generative models constrained by Iambic’s automated chemistry infrastructure. These molecules are passed to NeuralPLexer, a multi-scale diffusion-based generative model that directly predicts atom-level, ligand-induced conformational changes in protein-ligand complexes using only protein sequence and ligand graph as input. The resulting structural complexes inform both target engagement and binding specificity.
Finally, Enchant uses a multi-modal transformer architecture trained across diverse, noisy preclinical datasets to predict human pharmacokinetics and other clinical outcomes via transfer learning, achieving high predictive accuracy even with minimal clinical data. This architecture enables an iterative, model-driven workflow where molecular candidates are designed, structurally evaluated, and clinically prioritized entirely in silico before synthesis.
Finally, there is a notable example from the area of neurodegenerative diseases, Verge Genomics. The CONVERGE® platform developed by Verge is an end-to-end, closed-loop machine learning system that integrates large-scale human-derived biological data with predictive modeling.
At its core, CONVERGE® leverages high-dimensional, multi-modal datasets—including over 60 terabytes of human gene expression and inferred gene relationships, thousands of gene perturbation and ChIP-seq studies, millions of protein-protein interactions, and direct-from-human clinical samples across diseases such as ALS, Parkinson’s, and FTD.

These data are used to train machine learning models that identify and prioritize drug targets with increased translational relevance, avoiding reliance on animal or artificial cell models that poorly mimic human biology. Predictions from these models are experimentally validated in-house using Verge’s wet lab infrastructure, forming a feedback loop that continuously refines both biological hypotheses and model performance.
This integration of patient-derived tissue data, mechanistic genomics, and computational target prioritization is aimed at the identification of clinically viable drug candidates without brute-force screening. Verge’s internally developed clinical compound was derived entirely through CONVERGE® in under four years, including the target discovery stage.
Conceptually, “AI drug discovery”, in contrast to “legacy” computational systems, refers to a modern computational tech stack, usually a multimodal ensemble, that is capable of modeling biology holistically. It draws on molecular, phenotypic, and clinical data of all types and sizes (chemical, omics, text, images such as cell staining, EHR, etc.), either all at once or across a substantial part of that variety.
Generative AI
Another crucial aspect that distinguishes modern AIDD from earlier computational tools is generative capability.
While companies like Insilico Medicine pioneered the use of Generative Adversarial Networks (GANs) for generative chemistry back in 2016, leveraging their ability to model complex molecular distributions and propose novel chemical structures, it was the introduction of the transformer architecture in 2017, followed by models like BERT and GPT, that in our opinion brought about a paradigm shift in generative modeling across domains.
We consider 2017 a pivotal year for generative AI, including in chemistry and biology, following the landmark paper “Attention is all you need”.

These architectures, pioneered by Google, and later developed by OpenAI, Anthropic, Mistral AI, and others, demonstrated unparalleled scalability and capacity for capturing long-range dependencies in sequential data.
By pretraining on vast corpora of text (hundreds of billions and even trillions of tokens) and employing self-attention to dynamically weight input relationships, transformers enabled large-scale generative models such as GPT-3 and GPT-4 to generate highly coherent and contextually accurate outputs.
Yes, “hallucinations” are still a major issue, but the shift is profound nonetheless. The pioneering commercial products in this regard are ChatGPT, primarily for text-to-text generation, Midjourney for text-to-image generation, and many others for text-to-video, text-to-music, etc.
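For readers who want to see what the self-attention step mentioned above amounts to, here is a minimal sketch of scaled dot-product self-attention. As a simplifying assumption, the token vectors are used directly as queries, keys, and values; real transformers apply learned projection matrices and multiple attention heads. The three toy token vectors are arbitrary.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections."""
    dim = len(tokens[0])
    outputs, all_weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row_weights = softmax(scores)
        # Each output is a weighted average of all token vectors.
        mixed = [sum(w * value[j] for w, value in zip(row_weights, tokens))
                 for j in range(dim)]
        outputs.append(mixed)
        all_weights.append(row_weights)
    return outputs, all_weights


toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention_weights = self_attention(toy_tokens)
```

Each row of `attention_weights` sums to one, so every output vector is a data-dependent blend of the inputs. That is the sense in which attention dynamically weights relationships, whether the tokens are words, atoms, or residues.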
The emergence of practically feasible transformers and large language models catalyzed something of a race in computational chemistry and biology towards so-called foundation models. The article “19 Companies Pioneering AI Foundation Models in Pharma and Biotech” summarizes some of the initiatives in this domain.
To summarize, here is a simple, generalizable framework for drawing a line between legacy CADD and modern AIDD:
Table 1
Dimension | Traditional Chem(Bio)informatics | AI Drug Discovery |
---|---|---|
Primary Focus | Methodical QSAR, structure-based design, library searches | Automated, data-intensive predictions and/or generative output, end-to-end optimization, novel hypothesis generation, biology scoring, etc. |
Core Techniques | QSAR (linear/non-linear models); Docking & virtual screening; Descriptor-driven modeling | Deep learning (CNNs, GNNs); Generative models (VAEs, GANs); Transformers, attention algorithm; Active learning, reinforcement learning |
Feature Engineering | Heavily reliant on manually crafted descriptors; Traditional molecular fingerprints | Automated feature extraction from raw data (e.g., molecular graphs); Learns non-obvious patterns |
Data Sources | Limited to known chemical and structural data; Smaller curated databases | Integration of large-scale multi-modal data (omics, real-world evidence); Massive virtual libraries; Synthetic data |
Generative Capability | Rule-based or library-based enumeration; Similarity-driven searches | Machine learning–based de novo molecule generation; Novel chemistry exploration |
Scalability | Often constrained by computational cost of docking or QSAR on moderate-sized libraries | Designed to handle billions of compounds or biological data points in silico; Cloud-based, high-throughput pipelines |
Human Involvement | Significant expert intervention needed (e.g., choosing descriptors, scoring functions) | Reduced manual involvement through automation; AI suggests experiments and molecules for validation |
Integration Across Stages | Typically used as isolated tools (e.g., for docking or property prediction) | Can form an end-to-end platform (target ID to lead optimization, to clinical trial optimization ideas or predicting clinical trial success); Real-time feedback loops |
Scope of Insights | Narrowly focused on chemical structures and known SAR rules | Deeper pattern recognition across complex, high-dimensional datasets; Potential for discovering novel biology and chemistry, novel hypotheses |
Value Proposition | Proven track record for well-known targets and chemical series | Potential for identifying breakthrough hypotheses, targets, biomarkers, and molecules, as well as diagnostic solutions; Accelerated and more efficient R&D cycles |
Next, as we have reviewed what “AI drug discovery” attempts to model (holistic biology vs mainstream “reductionism”), and what kind of models are generally capable of doing so, let’s discuss another crucial aspect — AI platform “maturity” as a software product.
"AI Drug Discovery" is Also About Building Software
A characteristic feature of leading AIDD platforms, as opposed to “superficial” AI companies, is a demonstrable focus on building actual software. This is a simple observation, yet one overlooked by many analysts, journalists, and commentators in this area.

We should expect an AIDD company to be able to demonstrate a robust, self-contained software platform that supports critical functionalities, ranging from user-friendly graphical interfaces (GUI) for data input and parameter tuning to configurable machine learning modules (including algorithm selection, hyperparameter adjustment, and visualization of model performance).
Such a platform should integrate standardized data ingestion pipelines (e.g., for omics data, small-molecule libraries, or clinical metadata) with back-end components enabling dynamic model training, validation, and iterative optimization (e.g., active learning, reinforcement learning loops).
A well-documented application programming interface (API) is also essential for interoperability with external tools, ensuring end users can automate workflows and seamlessly exchange data between software components. Additionally, a proper end-to-end solution should incorporate security, data integrity measures (version control, audit trails, encryption), and deployment options (on-premises or cloud-based) to fit diverse organizational needs.
For platforms that are meeting the definition of “AI drug discovery” suggested by a new framework, you can actually see and even access a demo of their software, and in some cases, like with Insilico Medicine, Schrodinger, OWKIN, Iktos, CytoReason, BenchSci, and others — license it and use it for your internal projects.
We were unable to identify information about software characteristics or live demos from the overwhelming majority of companies claiming to be AI-driven businesses.
In the context of AI-driven drug discovery (AIDD), the maturity of a company’s software platform is not a minor detail — it’s foundational. This is because no AI solution currently exists that can independently produce a clinical-grade therapeutic at the push of a button. Despite impressive advances, today’s AI systems serve primarily as intelligent co-pilots — tools that support, rather than replace, the expertise of human scientists.
Given this supporting role, the real value of an AI system lies in how seamlessly it integrates into a company’s internal workflows. To have a tangible impact on R&D productivity and innovation quality, the software must be more than a set of models or interfaces — it must be a mature, interoperable platform capable of scaling across the organization. Without that level of software sophistication, the promise of AI in drug discovery remains largely theoretical.
AI drug discovery (AIDD) companies operate at the intersection of software and life sciences, yet their valuation and benchmarking remain largely pharma-centric. Current assessments prioritize clinical pipeline progression, regulatory milestones, and wet-lab validation, while largely overlooking key software-driven metrics such as model accuracy, algorithmic scalability, data ownership, and compute efficiency.
Given that many AIDD firms generate value through AI-powered platforms, predictive analytics, and proprietary datasets, their business potential should also be measured using software industry methodologies—such as revenue from AI-as-a-service models, IP valuation of proprietary algorithms, and cloud-based scalability. A more comprehensive benchmarking approach should integrate both software and pharmaceutical industry frameworks to more accurately capture the diverse value propositions of AIDD companies.
Access to Data is King
We expect 2025 to bring increased momentum in areas related to data generation, integration, and applied analytics — particularly across technologies like next-generation sequencing (NGS), advanced proteomics, mass spectrometry, cryo-EM, organ-on-chip systems, and robotics-enabled laboratories. These technologies are foundational to enabling richer, more comprehensive datasets for use in drug discovery and translational research.
In this context, companies such as Tempus represent a broader class of data infrastructure providers. Tempus focuses on aggregating and structuring clinical and molecular data, and developing software systems to support data accessibility and clinical decision-making. Their platform is used by healthcare institutions and life sciences organizations to inform diagnostics, treatment choices, and research efforts. As the volume and complexity of biomedical data continue to increase, such platforms may become central to integrating real-world and experimental datasets in support of AI-driven discovery workflows.
From early on, AI drug discovery companies like Insilico Medicine, BPGbio, Recursion, and, more recently, NOETIK have been investing in data acquisition as a cornerstone of their asset valuation.
For instance, Insilico Medicine's so-called “6th-generation” intelligent robotics drug discovery laboratory, launched in December 2022, integrates AI-powered decision-making with fully automated robotic modules for target discovery, compound screening, precision medicine development, and translational research.

According to company claims, by combining its Pharma.AI platform with six functional modules—spanning automated cell culture, high-throughput screening, next-generation sequencing (NGS), and high-content imaging—the lab forms a closed-loop system that validates novel targets, optimizes lead compounds, and generates high-quality biological data to train and refine AI models.
In another example, Recursion's data foundation includes 65+ petabytes of proprietary multiomics data, such as phenomics, transcriptomics, proteomics, ADME, InVivomics, genomics, and patient data. Internally, they’ve generated ~36 petabytes via 2.2 million weekly high-throughput experiments, using CRISPR-Cas9 editing and Brightfield imaging to create one of the largest pharma-related datasets. This data is embedded using AI models for advanced biological analysis.
According to an exclusive interview with a company representative, Recursion processes 2.2+ petabytes of transcriptomics data and integrates 20 petabytes of patient data from Helix and Tempus, covering whole genome and exome sequencing from hundreds of thousands of cases.
In cell manufacturing, Recursion produces 1 trillion hiPSC-derived neuronal cells, creating the "Neuromap" for neuroscience and oncology programs with Roche and Genentech, spanning 40 therapeutic programs.
As a further example, CytoReason constructs its data foundation by integrating extensive public and proprietary datasets, encompassing bulk and single-cell transcriptomics, proteomics, and clinical data, into a unified AI-driven Disease Model Platform.
This platform employs advanced machine learning algorithms to map and compare treatments, patient groups, and disease mechanisms at cellular and molecular levels, enabling comprehensive analyses across various diseases and tissues.
Yet another example is Berg Health (now BPGbio), which established an extensive biobank comprising over 100,000 clinically annotated human specimens, including biofluids and tissue samples, to fuel their AI-driven drug discovery platform.
The company conducted comprehensive multi-omics profiling of the specimens—encompassing genomics, proteomics, metabolomics, and lipidomics—to capture a holistic view of human biology.
The resulting high-dimensional datasets were analyzed using their proprietary NAi Interrogative Biology® platform, which integrates Bayesian artificial intelligence learning algorithms to identify disease-specific biomarkers and therapeutic targets. This arguably contributed to the company’s successful clinical trial launches over the years.
Finally, a more recent but promising approach is how NOETIK is building their data foundation by sourcing curated human tumor specimens for its in-house biobank, applying stringent quality controls on parameters like ischemia time, necrosis percentage, and sample age, and ensuring each sample is pathologist-reviewed before inclusion.
The company generates multimodal datasets through advanced techniques such as spatial transcriptomics for single-cell RNA expression, whole exome sequencing for genomic alterations, and custom protein panels to map tumor-immune microenvironment interactions, all anchored by spatially randomized tissue microarrays to mitigate slide-level artifacts.

This vast data pipeline, coupled with their patent-pending processes, enables the creation of high-quality, self-supervised training datasets that power their AI engine OCTO, designed to model tumor biology and predict patient-specific therapeutic responses.
Validation is Critical for AI Drug Discovery Platforms
Finally, a central measure of credibility in AIDD is platform validation, which typically involves demonstrating tangible outcomes and reproducibility across diverse use cases.
Possible ways to validate a platform include a combination of the following:
(1) By advancing internal pipelines of novel therapeutics, where the AI engine is used to support the R&D team in discovering, designing, and optimizing lead molecules that progress through preclinical and, in some cases, clinical development.
(2) Through partnerships with established pharmaceutical or biotech organizations, enabling third-party testing of the AI platform’s predictive power and generative capabilities on proprietary datasets. A track record of public milestone announcements is critical.
(3) Via public software demos, proof-of-concept studies published in peer-reviewed journals, and patents.
(4) Via regular publication of AIDD case studies in peer-reviewed journals.
Below is a table showing historical pipeline growth dynamics for several notable companies frequently referenced in the AI-driven drug discovery space, including BenevolentAI, Healx, Insilico Medicine, Schrodinger, Relay Therapeutics, Recursion, Valo Health, Verge Genomics, and Exscientia, which was acquired by Recursion in 2024:
Table 2 (the indicated data is for end of March 2025)
DISCLAIMER: Historical data is sourced from archived public sources, including the Webarchive service (see references 1-25). The data is NOT provided by the companies and may have evolved.
——— | Program | Ownership | Indication | Target | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 | 2025◔ |
---|---|---|---|---|---|---|---|---|---|---|---|
BEN-8744 | Whole | Ulcerative Colitis | PDE10 | Discovery | Preclinical | Phase 1 | Phase 1 | unknown | |||
BEN-28010 | Whole | Glioblastoma Multiforme | CHK1 | Discovery | Preclinical | Preclinical | Preclinical | unknown | |||
BEN-34712 | Whole | ALS | RARαβ | Discovery | Preclinical | Preclinical | unknown | ||||
- | Whole | Parkinson's disease | - | Discovery | Discovery | Discovery | unknown | ||||
- | Whole | Fibrosis | - | Discovery | Discovery | Discovery | unknown | ||||
- | Co-owner w/ AstraZeneca | Chronic Kidney Disease | - | Discovery | Discovery | unknown | |||||
- | Co-owner w/ AstraZeneca | Heart Failure | - | Discovery | unknown | ||||||
- | Co-owner w/ AstraZeneca | Systemic Lupus Erythematosus | - | Discovery | unknown | ||||||
- | Co-owner w/ Merck | Oncology | - | Discovery | Discovery | unknown | |||||
- | Co-owner w/ Merck | Neurology | - | Discovery | Discovery | unknown | |||||
- | Co-owner w/ Merck | Immunology | - | Discovery | Discovery | unknown | |||||
- | Co-owner w/ AstraZeneca | Idiopathic Pulmonary Fibrosis | - | Discovery | unknown | ||||||
BEN-2293 | Whole | Atopic Dermatitis | TrkA, TrkB, and TrkC | Discovery | Preclinical | Phase 1 | Phase 2 | Phase 2 | |||
BEN-9160 | Whole | ALS | Bcr-Abl | Discovery | Preclinical | unknown | |||||
- | Whole | Inflammatory Bowel disease (IBD) | - | Discovery | unknown | ||||||
- | Whole | Antiviral | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | NASH | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Parkinson's disease | - | Discovery | unknown | ||||||
- | Whole | Inflammation | - | Discovery | unknown | ||||||
HLX-1502 | - | Neurofibromatosis Type 1: plexiform/ cutaneous neurofibroma | - | Preclinical | Preclinical | Phase 1 | Phase 2 | ||||
HLX-1502 | - | Neurofibromatosis Type 2 | - | Preclinical | Preclinical | ||||||
HLX-0213 | - | Neurofibromatosis Type 1 | - | Preclinical | Preclinical | ||||||
HLX-0205 + HLX-0206 | - | Fragile X syndrome | - | Preclinical | Preclinical | Preclinical | Preclinical | ||||
HLX-0553 | - | Angelman syndrome | - | Preclinical | Preclinical | Preclinical | Preclinical | ||||
HLX-1066 | - | Autosomal dominant polycystic kidney disease (ADPKD) | - | Preclinical | Preclinical | Preclinical | unknown | ||||
- | - | Autosomal recessive polycystic kidney disease (ARPKD) | - | Preclinical | Preclinical | unknown | |||||
- | - | Autosomal Dominant Polycystic Liver Disease | - | Preclinical | Preclinical | unknown | |||||
- | - | Myotonic Dystrophy type-1 | - | Preclinical | unknown | ||||||
- | - | Autosomal Dominant Optic Atrophy | - | Preclinical | unknown | ||||||
HLX-2607 | - | Autosomal Dominant Polycystic Kidney Disease | - | Discovery | Preclinical | ||||||
- | - | Leber Hereditary Optic Neuropathy | - | Discovery | unknown | ||||||
- | - | Spinocerebellar Ataxia | - | Discovery | unknown | ||||||
- | - | Pseudoachondroplasia | - | Discovery | unknown | ||||||
- | - | Chronic pancreatitis | - | Preclinical | unknown | ||||||
- | - | Renal undisclosed disease | - | Preclinical | unknown | ||||||
- | - | Facioscapulohumeral muscular dystrophy (FSHD) | - | Preclinical | unknown | ||||||
- | - | COVID-19 | - | Preclinical | unknown | ||||||
- | - | Bone undisclosed disease | - | Preclinical | unknown | ||||||
- | - | Liver undisclosed disease | - | Preclinical | unknown | ||||||
- | - | Liver undisclosed disease | - | Preclinical | unknown | ||||||
- | - | Neuromuscular undisclosed disease | - | Preclinical | unknown | ||||||
INS018_055 | Whole | IPF | TNIK | Discovery | Preclinical | Phase 1 | Phase 2 | Phase 2 | Phase 2/3 | ||
ISM012 | Whole | Anemia of Chronic Kidney Disease | PHD1/2 | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | |||
ISM5411 | Whole | Inflammatory bowel disease (IBD) | PHD1/2 | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | |||
ISM8207 | Co-owner w/ Fosun | Immuno-oncology | QPCTL | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | |||
ISM3312 | - | COVID-19 | 3CLpro | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | |||
ISM3091 | Out-licensed, Exelixis | BRCA-mutant cancer | USP1 | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | |||
ISM5043 | Out-licensed, Menarini | ER+/HER2-breast cancer | KAT6 | Preclinical | Phase 1 | Phase 1 | |||||
- | Whole | Kidney fibrosis | TNIK | Discovery | Preclinical | Preclinical | Preclinical | Preclinical | |||
ISM3412 | Whole | MTAP-/-cancer | MAT2A | Discovery | Preclinical | Preclinical | Preclinical | Preclinical | |||
- | Whole | IPF (inhalable) | TNIK | Discovery | Discovery | Preclinical | Preclinical | Preclinical | |||
ISM9274 | Whole | Solid tumors | CDK12/13 | Discovery | Discovery | Preclinical | Preclinical | Preclinical | |||
ISM5939 | Whole | solid tumors | ENPP1 | Discovery | Discovery | Preclinical | Preclinical | Preclinical | |||
ISM4525 | Whole | Solid tumors | DGKA | Preclinical | Preclinical | Preclinical | |||||
ISM8001 | Whole | Solid tumors | FGFR2/3 | Preclinical | Preclinical | Preclinical | |||||
ISM6331 | Whole | Solid tumors | TEAD | Preclinical | Preclinical | Preclinical | |||||
- | Whole | Solid tumors | KIF18A | Preclinical | Preclinical | ||||||
ISM2196 | Whole | Solid tumors | WRN | Preclinical | Preclinical | ||||||
ISM027 | Whole | Solid tumors | cMYC | Discovery | Discovery | unknown | |||||
ISM016 | Whole | Gout flare | NLRP3 | Discovery | Discovery | Discovery | Preclinical | Preclinical | |||
ISM022 | Whole | AML, Solid tumors | CDK8 | Discovery | Discovery | unknown | |||||
ISM023 | Whole | Solid tumors | PARP7 | Discovery | Discovery | unknown | |||||
- | - | Skin Fibrosis | TNIK | Discovery | Discovery | unknown | |||||
- | Co-owner w/ Fosun | Diabetic Nephropathy, FSGS | - | Discovery | unknown | ||||||
REC-2282 | Whole | Neurofibromatosis Type 2 | HDAC | Preclinical | Phase 1 | Phase 1 | Phase 2 | Phase 2 | Phase 2 | ||
REC-4881 | Whole | Familial Adenomatous Polyposis | MEK1 and MEK2 | Preclinical | Phase 1 | Phase 1 | Phase 2 | Phase 2 | Phase 2 | ||
SYCAMORE / REC-994 | Whole | Cerebral Cavernous Malformation | antioxidant, no specific target | Preclinical | Phase 1 | Phase 1 | Phase 2 | Phase 2 | Phase 2 | ||
REC-4881 | Whole | AXIN1 or APC Mutant Cancers | MEK1 and MEK2 | Phase 1 | Phase 2 | ||||||
REC-3964 | Whole | Clostridium Difficile Colitis | C. difficile toxins | Discovery | Preclinical | Preclinical | Phase 1 | Phase 2 | Phase 2 | ||
REC-1245 | Whole | HR-proficient Ovarian Cancer RBM39 | RBM39 | Preclinical | Phase 1 | Phase 1 | |||||
Immunotherapy Target Epsilon | in-licensed from Bayer | Idiopathic Pulmonary Fibrosis | - | Preclinical | Preclinical | ||||||
- | Whole | Oncology | - | Discovery | Discovery | Discovery | unknown | ||||
Immunotherapy Target Alpha | Whole | Oncology | - | Discovery | Discovery | unknown | |||||
Immunotherapy Target Delta | Whole | - | - | Preclinical | Preclinical | ||||||
REC-3599 | Whole | GM2 Gangliosidosis | PKC and GSK3β | Preclinical | Phase 1 | terminated | |||||
- | Whole | Immune Checkpoint resistance in STK11-NSCLC | - | Preclinical | Preclinical | unknown | |||||
- | - | Pulmonary Arterial Hypertension | - | Preclinical | unknown | ||||||
- | Whole | - | - | Preclinical | unknown | ||||||
- | Whole | Neuroinflammation | - | Discovery | Discovery | unknown | |||||
- | Whole | Charcot-Marie-Tooth Disease Type 2 | - | Discovery | Discovery | unknown | |||||
Immunotherapy Target Beta | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Hepatocellular Carcinoma | - | Discovery | unknown | ||||||
- | Whole | Batten Disease | - | Discovery | unknown | ||||||
REC-617 | Co-owner w/ Apeiron | Transcriptionally addicted cancers | CDK7 | Preclinical | Preclinical | Phase 1/2 | Phase 1/2 | Phase 1/2 | |||
EXS4318 | Out-licensed, BMS | inflammatory and immunologic diseases | PKC-theta | Preclinical | Preclinical | Preclinical | Phase 1 | Phase 1 | |||
REC-4539 | Whole | Oncology, AML, SCLC | LSD1 | Discovery | Discovery | Preclinical | Preclinical | Phase 1 | |||
REC-3565 | Whole | Oncology, Hematology | MALT1 | Discovery | Discovery | Preclinical | Preclinical | Phase 1 | |||
REV102 | Co-owner | Hypophosphatasia | ENPP1 | Discovery | Preclinical | Preclinical | Preclinical | Preclinical | |||
EXS21546 | Majority, w/ Evotec | High Adenosine Signature Cancers | A2aR | Preclinical | Phase 1 | Phase 1/2 | Phase 1/2 | ||||
- | Whole | COVID-19 | Mpro | Discovery | Preclinical | unknown | |||||
- | Whole | Inflammation and Immunity | NLRP3 | Discovery | Preclinical | unknown | |||||
- | Co-owner | Psychiatry | - | Discovery | Preclinical | unknown | |||||
- | Co-owner | Oncology | ENPP1 | Discovery | Preclinical | unknown | |||||
- | Co-owner | Oncology | - | Discovery | Discovery | unknown | |||||
- | Co-owner | Inflammation and immunity | - | Discovery | Discovery | unknown | |||||
- | Co-owner | Inflammation and Immunity | - | Discovery | Discovery | unknown | |||||
- | Co-owner | Oncology | - | Discovery | Discovery | unknown | |||||
- | Co-owner | Oncology | - | Discovery | Discovery | unknown | |||||
- | Whole | Immuno-Oncology | HPK1 | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | unknown | ||||||
- | Whole | Anti-infective | - | Discovery | unknown | ||||||
RLY-4008 | Whole | FGFR2-altered cholangiocarcinoma (CCA) | FGFR2 (mutant+WT) | Discovery | Phase 1 | Phase 1 | Phase 1 | Phase 1/2 | Phase 1/2 | Phase 1/2 | |
RLY-2608 monotherapy | Whole | Breast cancer and solid tumors | PI3Kα | Phase 1 | Phase 1 | Phase 1 | Phase 1 | ||||
RLY-2608 | Whole | Vascular malformations | PI3Kα | Preclinical | Preclinical | ||||||
RLY-1013 (degrader) | Whole | Breast Cancer | ERα | Discovery | Preclinical | Preclinical | |||||
NRAS | Whole | melanoma, colorectal and non-small-cell lung | NRAS | Preclinical | Preclinical | ||||||
αGal Chaperone | Whole | Fabry disease | αGal | Preclinical | Preclinical | ||||||
RLV-PI3K1047 (RLY-5836) | Whole | - | PI3Kα | Discovery | Preclinical | Phase 1 | unknown | ||||
RLY-2139 | Whole | Oncology | CDK2 | Discovery | Discovery | Preclinical | paused | ||||
GDC-1971 | Co-owner w/ Genentech | Cancers, expand into multiple combination | SHP2 | Preclinical | Phase 1 | Phase 1 | Phase 1 | Phase 1 | Phase 1 | Phase 1 | |
- | Whole | - | PI3Kα | Discovery | unknown | ||||||
- | Whole | Oncology | - | Discovery | Discovery | unknown | |||||
- | Whole | Oncology | - | Discovery | Discovery | unknown | |||||
- | Whole | Genetic disease | - | Discovery | Discovery | unknown | |||||
- | Whole | Genetic disease | - | Discovery | Discovery | unknown | |||||
SGR-1505 | Whole | Relapsed or refractory B-cell lymphoma, chronic lymphocytic leukemia | MALT1 | Discovery | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | Phase 1 | |
SGR-2921 | Whole | Hematological cancers and solid tumors | CDC7 | Preclinical | Phase 1 | Phase 1 | Phase 1 | ||||
SGR-3515 | Whole | Solid tumors | WEE1/MYT1 | Discovery | Preclinical | Phase 1 | Phase 1 | ||||
SDGR5 | Whole | KRAS-driven Cancers | SOS1 | Discovery | Discovery | Discovery | Preclinical | Preclinical | Preclinical | ||
- | Whole | Neurology | LRRK2 | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Oncology | PRMT5-MTA | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Oncology | EGFR(C797S) | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Immunology | NLRP3 | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Oncology | - | Discovery | Discovery | Unknown | |||||
- | Whole | Oncology | - | Discovery | Discovery | Unknown | |||||
- | Whole | Immunology | - | Discovery | Discovery | Unknown | |||||
SDGR1 | Whole | Esophageal and Lung Cancers | CDC7 | Discovery | Discovery | Discovery | unknown | ||||
SDGR2 | Whole | Ovarian, Pancreatic, Breast and Lung Cancers | WEE1 | Discovery | Discovery | Discovery | unknown | ||||
TAK-279 | Co-owner w/ Takeda | Psoriasis | TYK2 | Phase 2 | Phase 3 | Phase 3 | |||||
- | Gilead | NASH | ACC | Phase 2 | Phase 2 | Phase 2 | Phase 2 | ||||
MORF-057 | Lilly | Inflammatory bowel diseases | α4β7 | Phase 2 | Phase 2 | Phase 2 | Phase 2 | ||||
- | Co-owner w/ Nimbus Therapeutics | Immuno-oncology | HPK1 | Phase 1/1 | Phase 1/2 | Phase 1/2 | |||||
- | Co-owner w/ Structure Therapeutics | Pulmonary arterial hypertension | APJR | Phase 1 | Phase 1 | Phase 1 | |||||
- | Structure Therapeutics | Idiopathic pulmonary fibrosis | LPA1R | Preclinical | Preclinical | Preclinical | |||||
- | Co-owner w/ Ajax | Oncology | JAK2 | Discovery | Preclinical | Preclinical | |||||
- | BMS | Neurology | - | Discovery | Discovery | Discovery | |||||
- | Collab. w/ BMS | Oncology, Immunology, Neurology | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Co-owner w/ Takeda | Oncology | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Co-owner w/ Lilly | Immunology | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Lilly | Pulmonary arterial hypertension | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Lilly | Solid tumors, fibrosis | αvβ8 | Discovery | Discovery | Discovery | Discovery | ||||
- | Lilly | GI indications | α4β7 | Discovery | Discovery | Discovery | Discovery | ||||
- | Co-owner w/ Bright Angel Therapeutics | Antifungal | HSP90 | Discovery | Discovery | Discovery | |||||
- | Structure Therapeutics | - | - | Discovery | Discovery | Discovery | |||||
- | Otsuka | CNS | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Co-owner w/ Loxo Therapeutics | oncology | - | Phase 1 | unknown | ||||||
- | Co-owner w/ BMS | immunology | - | Discovery | unknown | ||||||
- | Co-owner w/ Sanofi | oncology | - | Discovery | unknown | ||||||
- | Co-owner w/ BMS | Oncology | - | Discovery | unknown | ||||||
- | Co-owner w/ BMS | Oncology | - | Discovery | unknown | ||||||
- | Co-owner w/ BMS | Immunology | - | Discovery | unknown | ||||||
- | Co-owner w/ Zai Lab | Oncology | - | Discovery | unknown | ||||||
SDGR4 | Co-owner w/ BMS | Renal Cell Carcinoma | HIF-2a | Discovery | Discovery | unknown | |||||
- | Co-owner w/ BMS | Oncology, Immunology, Neurology | - | Discovery | unknown | ||||||
VRG50635 | Co-developer w/ Ferrer | ALS | PIKfyve | Discovery | Preclinical | Phase 1 | Phase 1 | Phase 1 | Phase 1 | ||
VRG201 | Whole | Obesity | CD38 | Preclinical | Preclinical | ||||||
VRG201 | Whole | Metabolic Syndrome | CD38 | Preclinical | Preclinical | ||||||
- | Whole | Alzheimer disease / Parkinson's Disease | PIKfyve | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Neurodegenerative Diseases | CD38 | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Peripheral | PIKfyve | Discovery | Discovery | ||||||
- | Whole | Schizophrenia | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Frontotemporal Dementia | - | Discovery | Discovery | Discovery | Discovery | ||||
- | Whole | Progressive Supranuclear Palsy | - | Discovery | Discovery | Discovery | Discovery | ||||
- | - | Crohn's Disease | - | Discovery | Discovery | Discovery | |||||
- | - | Ulcerative Colitis | - | Discovery | Discovery | Discovery | |||||
- | - | Psoriasis | - | Discovery | Discovery | Discovery | |||||
- | - | Lewy Body Dementia | - | Discovery | Discovery | ||||||
- | - | Friedreich’s Ataxia | - | Discovery | Discovery | ||||||
- | - | Myotonic Dystrophy 1 | - | Discovery | Discovery | ||||||
- | - | Pick's Disease | - | Discovery | Discovery | ||||||
Partnered Programs | Co-owner w/ Lilly | ALS | - | Discovery | Discovery | Discovery | |||||
Partnered Programs | Co-owner w/ Lilly | ALS | - | Discovery | Discovery | Discovery | |||||
- | - | Atopic Dermatitis | - | Discovery | unknown | ||||||
Partnered Programs | Co-owner w/ Alexion | Neurodegenerative Diseases | - | Discovery | unknown | ||||||
Partnered Programs | Co-owner w/ Alexion | Neuromuscular Diseases | - | Discovery | unknown | ||||||
- | Whole | COVID-19 | PIKfyve | Discovery | Preclinical | unknown | |||||
- | Whole | Undisclosed | - | Discovery | unknown | ||||||
- | Whole | Parkinson's Disease | - | Discovery | unknown | ||||||
- | Whole | Parkinson's Disease | - | Discovery | unknown | ||||||
OPL-0301 | - | Heart failure and Acute Kidney Injury | S1P1 agonist | Phase 1 | Phase 2 | Phase 2 | unknown | ||||
OPL-0401 | - | Diabetic Retinopathy | ROCK 1/2 inhibitor | Phase 1 | Phase 2 | Phase 2 | Phase 2 | ||||
OPAL-0022 | - | Atherosclerosis | - | Discovery | unknown | ||||||
OPAL-0004 | - | Atherosclerosis, Glioblastoma | - | Discovery | unknown | ||||||
OPAL-0018 | - | Atherosclerosis | - | Discovery | unknown | ||||||
OPAL-0003 | - | Heart Failure, Glioblastoma | - | Discovery | unknown | ||||||
OPL-0101 | - | Immuno-Oncology | - | Discovery | Preclinical | unknown | |||||
OPAL-0021 | - | cancer | - | Discovery | unknown | ||||||
OPAL-0015 | - | NSCLC, Squamous Cell Carcinoma, Targeted Defined Tumors | USP28 | Discovery | unknown | ||||||
OPAL-0024 | - | Solid Tumors | - | Discovery | unknown | ||||||
OPAL-0001 | - | Medulla/Glioblastoma Brain Tumors, Breast Cancer | PARP1 | Discovery | unknown | ||||||
OPAL-0014 | - | Pancreatic Ductal Adenocarcinoma (PDAC), Targeted Defined Tumors | - | Discovery | unknown | ||||||
OPAL-0023 | - | Defined Tumors, Immune Modulation | - | Discovery | unknown | ||||||
OPAL-0012 | - | NSCLC | USP7 | Discovery | unknown | ||||||
OPAL-0016 | - | Induced Neuropathy and Cardiomyopathy | - | Discovery | unknown | ||||||
OPAL-0002 | - | Neurodegenerative disorders | - | Discovery | unknown | ||||||
OPAL-0006 | - | Neurodegenerative: Oncology (metastatic) | - | Discovery | unknown |
The table and infographics above show notable pipeline growth at Insilico Medicine over the past five years. Specifically, the company has launched 31 therapeutic programs targeting diverse indications, with 22 preclinical candidates nominated from 2021 and nine more in 2022, and a total of 10 programs receiving IND approval. The leading Insilico program, for idiopathic pulmonary fibrosis (IPF), progressed from concept to Phase 1 trials in just under 30 months and is now in Phase 2 clinical trials in both the United States and China, while five other programs are in Phase 1.
Additionally, several clinical assets have been out-licensed or co-developed with third parties, with recent milestone payments received. These accomplishments appear to highlight a significant productivity boost, yet the question remains whether they prove that AI technologies, rather than more conventional R&D structures and partnerships, are the key drivers behind this rapid expansion.
A similar phenomenon is observed with Schrodinger’s platform, which, although not explicitly marketed as “AI,” has enabled significant pipeline development. While software capabilities can help streamline decision-making — such as accelerating target identification and optimizing lead compounds — it is not a simple matter of “printing” successful molecules. Determining the true impact of AI requires assessing the extent to which such platforms reduce discovery cycles or increase success rates in a statistically meaningful way. One potentially stronger validation approach is to examine how many third-party organizations license and effectively use these tools for drug development, thereby providing external feedback and real-world performance benchmarks.
Beyond just the number of drug candidates in the pipelines of AI companies, it is interesting to look at the target novelty landscape of some of the well-known AI players:

Speed and Cost of Drug Development
The preclinical timelines (from program start to IND) reported by some AI-driven drug discovery companies over the last several years suggest a seemingly accelerated path compared to known industry averages.
For instance, companies like Insilico Medicine, Recursion, and Exscientia report compressing the discovery phase from the industry-standard 2.5 to 4 years (30-48 months) down to 9 to 18 months in some cases.
According to a recently published benchmark, Insilico Medicine averages 12-18 months per program, testing only 60-200 molecules, while Recursion advances candidates in 18 months with fewer than 200 molecules per program. Exscientia, which merged with Recursion, claims to have shortened its timeline from four to five years to just 12 to 18 months, screening 150-250 molecules — a notable contrast to traditional methods that sometimes require testing 3,000-5,000 molecules per program.
Company | Discovery Timelines | Programs |
---|---|---|
Benevolent AI | Around 24 months in the case of BEN‑8744 | A small molecule PDE10 inhibitor for UC treatment |
Evaxion | Around 12 months for EVX‑01 | Neoantigen vaccine EVX-01 for metastatic melanoma |
Exscientia (merged with Recursion) | Around 11 months for EXS4318; ~12-18 months on average | EXS4318 (PKC-theta inhibitor) for inflammatory and immunologic diseases; on average, 150-200 mols. per program |
Iambic Therapeutics | Around 8 months for IAM1363 | A small molecule for the treatment of HER2-altered cancers |
Insilico Medicine | ~12 months on average across 22 preclinical candidates | Programs in idiopathic pulmonary fibrosis (IPF), inflammatory bowel disease (IBD), immuno-oncology, COVID-19, and others; on average, 60-200 mols. per program |
Nimbus Therapeutics | Around 56 months for NDI‑034858 | NDI-034858 is an allosteric TYK2 inhibitor for the treatment of multiple autoimmune diseases |
Recursion | At least 18 months in the case of REC‑1245 | REC-1245, an RBM39 degrader for solid tumors and lymphoma; <200 mols. per program |
Relay | Around 48 months for RLY‑4008 | FGFR2-specific inhibitor RLY-4008 for cholangiocarcinoma |
Schrodinger | Over 24 months for SGR‑3515 | A Type 1 kinase inhibitor oncology project |
Traditional approaches | 2.5-4 years (30-48 months) | 3000-5000 mols. per program |
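To make the comparison in the table concrete, the short Python sketch below computes fold-reduction ranges from the figures reported above. It is illustrative arithmetic only: the helper name `fold_range` is ours, the traditional baseline is derived from the 2.5-4 year figure converted to months, and the company ranges are the self-reported ones quoted in the text, not independently validated.

```python
def fold_range(baseline, reported):
    """Return (min_fold, max_fold) reduction of `reported` versus `baseline`.

    Both arguments are (low, high) ranges. The most conservative fold
    compares the low baseline with the high reported value; the most
    optimistic fold compares the high baseline with the low reported value.
    """
    base_low, base_high = baseline
    rep_low, rep_high = reported
    return base_low / rep_high, base_high / rep_low


# Traditional baseline as stated in the text: 2.5-4 years, 3000-5000 molecules.
TRADITIONAL_MONTHS = (2.5 * 12, 4 * 12)
TRADITIONAL_MOLECULES = (3000, 5000)

# Self-reported ranges quoted in the text (months, molecules per program).
REPORTED = {
    "Insilico Medicine": {"months": (12, 18), "molecules": (60, 200)},
    "Exscientia": {"months": (12, 18), "molecules": (150, 250)},
}

for company, figures in REPORTED.items():
    t_min, t_max = fold_range(TRADITIONAL_MONTHS, figures["months"])
    m_min, m_max = fold_range(TRADITIONAL_MOLECULES, figures["molecules"])
    print(f"{company}: timeline {t_min:.1f}x-{t_max:.1f}x shorter, "
          f"molecules {m_min:.0f}x-{m_max:.0f}x fewer")
```

Because the inputs are company-reported ranges for selected programs, the resulting ratios illustrate the scale of the claims rather than a validated benchmark.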
However, these seemingly accelerated timelines have yet to fully translate into clinical success. While some AI-developed drugs have progressed into ongoing clinical trials — such as those from Insilico Medicine, Iambic, and Recursion — there are a number of failed clinical trials or discontinuations for strategic reasons. Some examples are discussed in our 2024 report “It’s Been a Decade of AI in the Drug Discovery Race. What’s Next?”
Although AI-based computational tools may identify promising candidates faster (according to various claims), this does not guarantee that the resulting drugs will be clinically viable, effective, or safe. The reduced number of molecules screened in AI-driven programs may also pose risks, as narrowing the search space too aggressively could lead to overlooked liabilities that emerge later in clinical development.
Apart from timelines, cost is another important parameter. While this report does not examine the cost structure of such programs, Insilico Medicine once reported that some of its AI-designed drug candidates were discovered at around 10% of a “conventional” program cost, a claim we did not specifically validate.
Pragmatic Recommendations for AIDD Investors and Stakeholders:
Look Beyond Candidate Counts: Merely tallying the number of pipeline assets does not capture the incremental value AI platforms may provide. Faster program initiation or lower attrition rates, for instance, could be more telling indicators.
Evaluate Decision-Making Efficiency: Pinpoint where AI significantly shortens R&D workflows, for example by expediting hit-to-lead stages, improving target validation, or supporting more efficient clinical trial protocol design.
Scrutinize External Adoption: Seek third-party evidence of productivity gains, such as collaboration announcements, successful milestones, or continued software licensing agreements. Tools that are openly licensed or sold commercially allow for real competitive benchmarking.
Consider Contextual Factors: Keep in mind that corporate strategy, funding, and existing R&D infrastructure often play major roles in pipeline output. AI’s contribution cannot be isolated without analyzing these concurrent influences, and in practice it is almost impossible to quantify the actual impact of AI algorithms on the drug development process.
Grand Vision of AI Drug Discovery for 2025 and Beyond
Having reviewed many self-described AI drug discovery companies over a decade of progress, hype, and reality checks (such as failed clinical trials of arguably AI-designed drug candidates), we believe it is time to define the goal and method of AI drug discovery, and to accept that the overwhelming majority of companies are not there yet.
The core idea of the AIDD movement is, in fact, not about improving existing drug discovery processes, such as structure-based drug discovery or virtual screening, through better models or more advanced machine learning.
Such improvements obviously help, and most companies in this business pursue them. Better models for screening or docking can bring marginal improvements to research processes, but they do not change the fundamental problem of drug discovery: poor translation of hypotheses into clinical results and a high rate of clinical failures due to unexpected toxicity or poor efficacy in a (sometimes poorly) selected patient subpopulation.
The novelty and ambition of the AIDD approach lie in redesigning the existing mainstream drug discovery paradigm into something different.
We suggest calling it “Holistic Drug Development (HDD).”
The approach starts from modeling the entirety of real-world data about patients (from specimens, analytical samples, EHRs, and other biomedical data), takes into account all available preclinical data and experience, and builds a path down to a relevant underlying hypothesis at the molecular level. It then walks that path in reverse: from the newly discovered hypothesis, via drug design and development, back to the patient, hopefully with an improved probability of success. We believe we are still years away from this reality, but a number of companies are already building pieces of the puzzle of the industrialized research workflow of the future.
Time will tell whether AIDD proves to be the better way to achieve the HDD vision; we are cautiously optimistic.
References
1. Schrodinger, pipeline, December 2023 https://www.schrodinger.com/pipeline
2. Schrodinger, pipeline, November 2022 https://web.archive.org/web/20221124124721/https://www.schrodinger.com/pipeline
3. Schrodinger, pipeline, June 2021 https://web.archive.org/web/20210620183431/https://www.schrodinger.com/pipeline
4. Schrodinger, pipeline, June 2020 https://web.archive.org/web/20200606152921/https://www.schrodinger.com/pipeline
5. Schrodinger, pipeline, July 2019 https://web.archive.org/web/20190717045358/https://www.schrodinger.com/pipeline
6. Recursion, pipeline, December 2023 https://www.recursion.com/pipeline
7. Recursion, pipeline, January 2022 https://web.archive.org/web/20220131104947/https://www.recursion.com/pipeline
8. Recursion, pipeline, February 2021 https://web.archive.org/web/20210225041638/https://www.recursion.com/pipeline
9. Recursion, pipeline, January 2021 https://web.archive.org/web/20210129043831/https://www.recursion.com/pipeline
10. Exscientia, pipeline, November 2023 https://web.archive.org/web/20231130165922/https://www.exscientia.ai/pipeline
11. Exscientia, PR, August 2022 https://www.businesswire.com/news/home/20220817005681/en/Exscientia-Business-Update-for-Second-Quarter-and-First-Half-2022
12. Exscientia, article, July 2022 https://www.nanalyze.com/2022/07/exscientia-stock-ai-drug-discovery/
13. Exscientia, annual report, 2021 https://s28.q4cdn.com/460399462/files/doc_financials/2021/ar/2021-UK-Annual-Report.pdf
14. Relay, pipeline, November 2023 https://web.archive.org/web/20231111223956/https://relaytx.com/pipeline/
15. Relay, annual report (PDF), 2022 https://ir.relaytx.com/static-files/1b13dc48-4fb1-4ec3-b639-69636bc3ace1
16. Relay, annual report (PDF), 2021 https://ir.relaytx.com/static-files/65cffc5e-e6e3-42a3-9b87-cc44b93c2856
17. Relay, annual report (PDF), 2020 https://ir.relaytx.com/static-files/08d959ca-abd2-4a9c-bd25-be8eef73d732
18. BenevolentAI, pipeline, December 2023 https://web.archive.org/web/20231205114116/https://www.benevolent.com/pipeline/
19. BenevolentAI, annual report (PDF), 2022 https://www.benevolent.com/application/files/9816/7939/1282/BenevolentAI_Annual_Report_2022.pdf
20. Insilico, pipeline, December 2023 https://web.archive.org/web/20231204133620/https://insilico.com/pipeline
21. Insilico, pipeline, October 2022 https://web.archive.org/web/20221007131323/https://insilico.com/pipeline
22. Insilico, pipeline, February 2022 https://web.archive.org/web/20220213125657/https://insilico.com/pipeline
23. Verge Genomics, pipeline, February 2024 https://web.archive.org/web/20240306224636/https://www.vergegenomics.com/pipeline
24. Verge Genomics, pipeline, November 2022 https://web.archive.org/web/20221104085232/https://www.vergegenomics.com/pipeline
25. BenevolentAI, report release, 2021 and 2022 https://www.benevolent.com/news-and-media/press-releases-and-in-media/benevolentai-unaudited-preliminary-results-year-ended-31-december-2022/
26. BenevolentAI, report release, 2023 https://www.benevolent.com/application/files/2417/1136/4663/BenevolentAI_Annual_Report_2023.pdf
27. Exscientia, annual report (PDF), 2022 https://s28.q4cdn.com/460399462/files/doc_financials/2022/ar/EXAI_FY22-AR_final_compressed.pdf
28. Exscientia, annual report (PDF), 2023 https://d18rn0p25nwr6d.cloudfront.net/CIK-0001865408/2e2ce7ec-55dc-4fe6-afa6-d0437f22ada4.pdf
29. Recursion, annual report (HTML), 2021 https://ir.recursion.com/node/6926/html
30. Recursion, annual report (HTML), 2022 https://ir.recursion.com/node/8131/html
31. Recursion, annual report (HTML), 2023 https://ir.recursion.com/node/9691/html
32. Relay, annual report (HTML), 2021 https://ir.relaytx.com/node/7691/html
33. Relay, annual report (HTML), 2022 https://ir.relaytx.com/node/8531/html
34. Relay, annual report (HTML), 2023 https://ir.relaytx.com/node/9196/html
35. Schrodinger, annual report (PDF), 2021 https://d18rn0p25nwr6d.cloudfront.net/CIK-0001490978/7a72e457-9a9e-4efc-b9b3-5ead018c904d.pdf
36. Schrodinger, annual report (PDF), 2022 https://d18rn0p25nwr6d.cloudfront.net/CIK-0001490978/6835c32b-f977-482f-82c5-254066f66d06.pdf
37. Schrodinger, annual report (PDF), 2023 https://d18rn0p25nwr6d.cloudfront.net/CIK-0001490978/b3224b2d-5cc5-4081-ba8b-d89a31181139.pdf
38. XtalPi, report, 2021, 2022, 2023Q2 https://www1.hkexnews.hk/app/sehk/2023/105964/documents/sehk23113001942.pdf
39. Insilico, annual report (PDF), 2023 https://www1.hkexnews.hk/app/sehk/2024/106323/documents/sehk24032702892.pdf
Edits
- Edit 1 (2025-04-17): Following a clarification from Iambic representatives, we have updated the Iambic timeline in Table 3, replacing 24 months with 8 months. The company explains that 24 months is the time to reach the clinic, while it took only 8 months to get to IND studies.
Report methodology
An analysis of historical therapeutic pipeline data (Table 2) was carried out using archived snapshots from the Web Archive, allowing us to review how pipeline diagrams appeared at earlier points in time. In some instances, annual financial reports were also consulted to retrieve pipeline details for previous years.
Efforts were made to track each molecule or program within a given pipeline across successive years, and if a particular program did not appear in the following year’s records, it was generally assumed that it had been put on hold for various reasons.
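The tracking rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for this report: the snapshot structure, the function name `track_programs`, the program identifiers, and the use of the label "unknown" for a program that drops out of a later snapshot are all assumptions made for the example.

```python
def track_programs(snapshots):
    """Build a per-program stage history from yearly pipeline snapshots.

    `snapshots` maps a year to a dict of {program_id: stage}. A program that
    was present in an earlier snapshot but is missing from the next one is
    marked "unknown" once (assumed on hold); later years are left blank.
    """
    years = sorted(snapshots)
    programs = sorted({p for stages in snapshots.values() for p in stages})
    history = {}
    for program in programs:
        row, seen, dropped = [], False, False
        for year in years:
            stage = snapshots[year].get(program)
            if stage is not None:
                row.append(stage)
                seen, dropped = True, False
            elif seen and not dropped:
                row.append("unknown")
                dropped = True
            else:
                row.append("")
        history[program] = row
    return years, history


# Hypothetical snapshots for two made-up programs.
example = {
    2021: {"PROG-A": "Discovery", "PROG-B": "Discovery"},
    2022: {"PROG-A": "Preclinical"},
    2023: {"PROG-A": "Phase 1"},
}

years, history = track_programs(example)
for program, row in history.items():
    print(program, "|", " | ".join(row))
```

In this example, the made-up program PROG-B appears only in the 2021 snapshot, so it is marked "unknown" for 2022 and left blank afterwards, mirroring how discontinued or undisclosed programs are shown in the pipeline table.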
Target novelty analysis for Diagram 3 was performed based on the methodology and mathematical formula outlined in this file.
Correction policy
If you come across any factual inaccuracies or outdated information, please don’t hesitate to contact us promptly. We will address these issues by issuing corrections in a dedicated section of our report, pending editorial review.
This correction policy covers company profiles, technology evaluations, and all comparative analyses included in our report. Stakeholders are encouraged to report potential errors to our editorial team using this form.
All corrections will be clearly dated and thoroughly detailed to uphold the integrity of our comparative report and ensure our readers have access to the most accurate and up-to-date information.