AI in Drug Discovery: The Most Underdiscussed Frontier in Capital Markets
Pharmaceutical research remains one of the largest, slowest, and least productive industrial sectors in the developed world. The industry spends roughly $150 billion every year trying to bring new drugs to market, yet the return on that capital has been deteriorating for seventy years.
The numbers are stark. A single new drug still takes 10–12 years and $2–3 billion to develop. Approximately 90% of candidates that enter human trials fail before approval. Adjusted for inflation, the number of new drugs approved per dollar of R&D spend has fallen by a factor of roughly 80 since 1950. The industry even has a name for this: Eroom’s Law, which is Moore’s Law spelled backwards.
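The 80-fold figure can be converted into a halving time with one line of arithmetic. The sketch below assumes the decline is measured over the 60 years from 1950 to 2010, the window usually cited for Eroom's Law. The text above gives only the 1950 starting point, so `SPAN_YEARS` is an assumption, and the function name is illustrative.

```python
import math


def halving_time_years(fold_decline: float, span_years: float) -> float:
    """Years per halving implied by a total fold decline over a time span."""
    if fold_decline <= 1 or span_years <= 0:
        raise ValueError("fold_decline must exceed 1 and span_years must be positive")
    # An N-fold decline is log2(N) successive halvings.
    return span_years / math.log2(fold_decline)


FOLD_DECLINE = 80  # "a factor of roughly 80" from the text
SPAN_YEARS = 60    # assumption: 1950 to 2010

halving = halving_time_years(FOLD_DECLINE, SPAN_YEARS)
print(f"R&D productivity halves roughly every {halving:.1f} years")
```

Under that assumption the result is about 9.5 years per halving. A longer window would imply a slower decline, so the span is the input to scrutinize.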
This is the setup: a century-old, capital-intensive industry sitting on the largest biological dataset ever assembled, with productivity in structural decline and patent cliffs accelerating. It is the textbook profile of an industry waiting to be restructured.
The core argument is that artificial intelligence is now the mechanism beginning to drive that restructuring, and that the value created over the next decade will be substantial, asymmetric, and unevenly distributed across the stack. Investors who understand where value accrues and what validation to wait for have one of the more compelling long-duration opportunities in public and private markets today.

What AI Actually Changes
Drug development is a brutal sequential filter. Roughly 10,000–20,000 starting compounds are narrowed to a single approved drug, with costs compounding dramatically as candidates move from discovery through preclinical work and into expensive late-stage trials.
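To see how severe this filter is, the compound counts above can be converted into an average pass rate per stage. This is a rough sketch: the five-stage split (`N_STAGES`) is an assumption made for illustration, not a figure from the text, and real stages have very different pass rates.

```python
def average_stage_pass_rate(starting_compounds: int, n_stages: int) -> float:
    """Geometric-mean pass rate per stage if the starting pool yields one approval."""
    if starting_compounds < 1 or n_stages < 1:
        raise ValueError("starting_compounds and n_stages must be positive")
    overall_survival = 1 / starting_compounds
    return overall_survival ** (1 / n_stages)


N_STAGES = 5  # assumption: e.g. screening, preclinical, Phase I, Phase II, Phase III

for start in (10_000, 20_000):  # range quoted in the text
    rate = average_stage_pass_rate(start, N_STAGES)
    print(f"{start:>6,} compounds -> 1 drug: ~{rate:.1%} pass per stage on average")
```

With five stages, the average works out to roughly 14–16% of candidates surviving each stage. Fewer assumed stages would make the implied per-stage filter harsher still.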

AI changes this equation in three fundamental ways:
1. Target Discovery
Modern machine learning can integrate genomic, proteomic, transcriptomic, and clinical data at a scale no human team can match. It surfaces which biological mechanisms are most likely to be both disease-relevant and chemically tractable, dramatically expanding the addressable universe of potential drugs.
2. Molecule Design
DeepMind’s AlphaFold (and its successors) fundamentally altered what is possible. Where chemists once spent months crystallizing proteins to understand their shape, they can now query predicted structures for over 200 million proteins. Newer foundation models go further: designing novel molecules from scratch, predicting toxicity and ADME properties in silico, and proposing antibodies or other modalities with desired characteristics before any wet-lab work begins.

3. Clinical Trials & Development
Trials are the single largest cost center in pharma R&D. AI can improve patient recruitment and stratification, reduce dropout rates, detect earlier efficacy or safety signals, and construct synthetic control arms from real-world data. Industry estimates suggest that AI, combined with reduced animal testing and better preclinical modeling, could cut preclinical costs and timelines by up to 50%.
The aggregate effect, if realized, is transformative. Cutting average development time by even a few years while meaningfully lifting success rates would unlock hundreds of billions in annual economic value. That is not incremental improvement; it is industry restructuring.

The Stack
For investment purposes, the AI drug discovery ecosystem is best understood as a stack, much like the internet or broader AI infrastructure layers. Different layers carry different risk profiles, defensibility, and return characteristics.
- Compute Infrastructure (Base Layer): Training large biological models and running molecular dynamics simulations demands massive GPU resources. Roche’s AI factory with thousands of NVIDIA GPUs is one example among many. NVIDIA, major cloud providers, and specialized HPC players capture this layer regardless of which therapeutic platform ultimately wins.

- Data Infrastructure: AI is only as good as its training data. Biological data remains fragmented and often proprietary. Companies that own large, curated, well-annotated datasets (from sequencing platforms, high-throughput screens, or patient registries) possess one of the few durable moats. Illumina, 10x Genomics, IQVIA, and various private data holders sit here.
- Foundation Models: AlphaFold, ESM-3, and similar large biological models are increasingly open-sourced and heading toward commoditization. The real question is whether companies that fine-tune these models on proprietary data can maintain a defensible edge.
- AI-Native Drug Discovery Platforms: This is the most visible and volatile layer. Companies like Recursion, Schrödinger, Relay Therapeutics, AbCellera, Absci, and Certara operate here (alongside many private players). Some function primarily as software vendors; others are building their own pipelines while monetizing partnerships. Most are hybrids.
- Pharma Incumbents (Top Layer): Every major pharma company now has AI partnerships, internal teams, or both. Roche, Eli Lilly, Novartis, and AstraZeneca are investing aggressively. The strategic question is whether incumbents can internalize AI capabilities fast enough or whether platform companies will capture more of the value chain.
Where the Value Will Accrue
This is the most important and least settled question for investors.
Infrastructure (especially compute and data) offers the cleanest near-term exposure. NVIDIA captures spend no matter which platform succeeds, analogous to its role across the broader generative AI buildout. This is the classic “picks and shovels” trade with the lowest scientific risk.
AI-native platform companies are where asymmetric upside (and binary outcomes) live. A small number will produce approved drugs and become large independent businesses or attractive acquisition targets. Many will not. As of now, no fully AI-designed drug has received FDA approval, though several have entered the clinic and shown early signals (including Recursion’s REC-4881). Investors here are underwriting both technology and biology, and biology remains the harder problem.
Software and services companies (e.g., Schrödinger, Certara) offer a middle path: recurring revenue from pharma customers tied to overall AI adoption rather than any single trial outcome. These tend to be steadier compounders.
Big Pharma is the most conservative way to gain exposure. Productivity gains flow through to pipelines but are diluted across enormous existing portfolios.
The Bull Case
In the strongest scenario, AI does to pharma what cloud computing did to enterprise software. Discovery and development cycles compress dramatically. Success rates rise meaningfully. New modalities, including personalized vaccines and machine-learning-optimized gene therapies, become routine. The economics shift from blockbuster monocultures to faster, broader, more personalized portfolios.
Successful AI-native platforms would not merely sell tools; they would become the new operating system of the industry. Some would be acquired at significant premiums; others would scale into large standalone biotechs. The companies that combine proprietary data, foundation models, and closed-loop wet-lab automation could emerge as the next generation of pharma giants.
The Bear Case
The bear case is not that AI achieves nothing. It is that AI optimizes the easier parts of drug development while the hardest part, correct biological hypothesis selection, remains largely unsolved.
Most late-stage failures occur because the underlying target or mechanism proves irrelevant or has unacceptable off-target effects in humans. AI excels at optimization within a given biological model. It is not yet clear it can systematically generate better hypotheses. If it cannot, timelines and early costs may compress, but late-stage attrition could stay stubbornly high. The economics improve, but not enough to justify current platform valuations.
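A toy calculation makes this asymmetry concrete. The sketch below is not a model of any real pipeline. Every cost and probability is a hypothetical placeholder, chosen so that late-stage spending dominates and overall success is 10% (matching the roughly 90% failure rate cited earlier), and the function name is illustrative. It compares halving early-stage cost with a modest lift in late-stage success.

```python
def cost_per_approval(early_cost: float, late_cost: float,
                      p_early: float, p_late: float) -> float:
    """Expected spend per approved drug in a two-stage pipeline.

    Each approval needs 1 / p_late candidates entering the late stage and
    1 / (p_early * p_late) candidates entering the early stage.
    """
    for p in (p_early, p_late):
        if not 0 < p <= 1:
            raise ValueError("probabilities must be in (0, 1]")
    return early_cost / (p_early * p_late) + late_cost / p_late


# Hypothetical placeholders in arbitrary cost units, not industry data.
BASE = dict(early_cost=10.0, late_cost=200.0, p_early=0.25, p_late=0.4)

baseline = cost_per_approval(**BASE)
cheaper_early = cost_per_approval(**{**BASE, "early_cost": 5.0})  # early cost halved
better_late = cost_per_approval(**{**BASE, "p_late": 0.5})        # late success 40% -> 50%

scenarios = [
    ("baseline", baseline),
    ("early cost halved", cheaper_early),
    ("late-stage success 40% -> 50%", better_late),
]
for label, cost in scenarios:
    print(f"{label:<32} {cost:7.1f}  ({cost / baseline - 1:+.0%} vs baseline)")
```

With these placeholder inputs, halving early-stage cost trims spend per approval by about 8%, while the ten-point lift in late-stage success trims it by 20%. The ranking is the point, not the magnitudes, and it reverses if early-stage spending is assumed to dominate.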
Additional risks include noisy or limited biological datasets, model overfitting, commoditization of open-source models, and the very real possibility that large pharma simply builds superior internal AI capabilities and bypasses the platform layer.
History offers precedent: the genomics revolution of the early 2000s promised personalized medicine within a decade. The science was real, but the timeline was off by a generation.
What to Watch (Next 12–36 Months)
A handful of catalysts will disproportionately shape conviction:
- Clinical proof points — Phase II readouts from AI-designed or AI-optimized compounds showing efficacy signals competitive with or superior to traditional peers.
- Unusual deal terms — Multi-billion-dollar pharma partnerships that transfer meaningful economics to AI platforms.
- Regulatory signals — FDA acceptance of AI-derived elements (e.g., synthetic controls) in registrational trials or approval of a drug where AI played a prominent discovery role.
- Aggregate productivity data — Whether the industry as a whole begins approving more drugs, faster, and at lower average cost. This will emerge gradually through the second half of the decade.
Positioning for Investors
A defensible approach overweights the layers with the clearest visibility while maintaining measured optionality on platforms:
- Majority allocation to compute and data infrastructure (NVIDIA as anchor, alongside cloud providers and select genomics/data companies).
- Smaller allocation to validated AI-native platforms with either clinical assets or substantial pharma partnerships (public examples: Recursion, Schrödinger, AbCellera; notable private: Isomorphic Labs).
- Targeted exposure to incumbents moving fastest on AI integration (Roche and Lilly stand out).
This is not a single-ticker thesis. It is a basket approach, weighted toward the layers with the cleanest risk/reward today, with asymmetric upside on the platforms that could become the next pharma giants if the science delivers.
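The ordering described above can be written down as explicit constraints on a basket. The weights below are hypothetical, since the text gives no percentages, and the bucket names and `check_basket` helper are illustrative only. The sketch checks structure and does not recommend any allocation.

```python
# Hypothetical weights for illustration only; the text gives no percentages.
basket = {
    "compute_and_data_infrastructure": 0.60,
    "ai_native_platforms": 0.25,
    "pharma_incumbents": 0.15,
}


def check_basket(weights: dict[str, float]) -> None:
    """Raise ValueError if the weights violate the ordering described in the text."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    infra = weights["compute_and_data_infrastructure"]
    if infra <= 0.5:
        raise ValueError("infrastructure should be the majority allocation")
    if weights["ai_native_platforms"] >= infra:
        raise ValueError("platform allocation should be smaller than infrastructure")


check_basket(basket)  # passes silently for the example weights
```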

The Bottom Line
The strongest case for AI in drug discovery is not the elegance of the technology. It is that the underlying industry is structurally broken in ways AI is uniquely positioned to address: decades of declining R&D productivity, a $150 billion annual spend with ~90% failure rates, and an exploding base of biological data that humans cannot process at scale.
The thesis is conditional. The science must work and clinical results must follow. But if even a meaningful fraction of the modeled productivity gains materialize, the value created over the next decade will rank among the more significant capital allocation stories of the era.
Investors waiting for full validation will pay a premium. Investors allocating today must be selective about where in the stack they sit, patient on timelines, and disciplined about which platforms they back.
This is a frontier worth watching closely and allocating to thoughtfully.
This brief is for informational purposes only and does not constitute investment advice. The author is not a registered investment advisor. Readers should conduct their own due diligence and consult qualified professionals before making any investment decisions.