Foundational Models for AI in Biology

A paradigm shift in genomics, transcriptomics and proteomics

Summary: Artificial Intelligence is transforming biological research, opening up new frontiers in genomics, transcriptomics and proteomics. From decoding DNA sequences to designing new proteins, AI-based models are accelerating discovery, predicting mutations and integrating complex biological data like never before.

The next era of biological discovery is being driven by a new class of artificial intelligence: foundation models. Unlike earlier AI systems trained for narrow, single-purpose tasks, foundation models are typically built on transformer architectures, the same family that underlies large language models (LLMs). This lets them learn from vast amounts of unlabeled data and adapt to multiple applications with minimal fine-tuning.
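To make "learning from unlabeled data" concrete, here is a minimal sketch of one common self-supervised setup, masked-token prediction, applied to a DNA string. It only shows how training pairs can be derived from raw sequence without human labels; the function name `mask_tokens`, the `[MASK]` placeholder, the per-base tokenization and the masking rate are illustrative assumptions, not the recipe of any specific model named in this article.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, chosen for illustration


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Build one (input, labels) pair for a masked-token objective.

    Each token is hidden with probability `mask_rate`. `labels[i]` keeps the
    original token where a mask was applied and is None everywhere else, so a
    model would only be scored on the hidden positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Assumption: one token per nucleotide; real models use their own tokenizers.
sequence = "ATGCGTACGTTAGC"
tokens = list(sequence)
masked, labels = mask_tokens(tokens, mask_rate=0.3, seed=1)
print(masked)
print(labels)
```

The sequence itself supplies the answers, so no manual annotation is needed. A pretrained model can later be adapted to a downstream task with a comparatively small labeled dataset.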

In drug discovery, this marks a shift from traditional, reductionist approaches, in which researchers pursued a linear path from single-target identification to drug development, toward a more integrated, systems-level understanding of biology. These models can process and draw connections across genomic, transcriptomic and proteomic datasets, making it possible to study the complexity of diseases in far greater detail than single-target approaches allow.

A new generation of biological models takes discovery to the next level

Foundation models, which are constantly improving by learning from vast, complex biological datasets, are capable of recognizing patterns in DNA, RNA and protein sequences, predicting structural and functional changes and generating entirely new biological designs.

In genomics and transcriptomics, models including Evo2, DNABERT-2, Nucleotide Transformer, Geneformer, scFoundation, scGPT, RNA-FM (not publicly available yet), and RNABERT are advancing our ability to interpret genetic code, understand gene expression, and explore cell-specific biology at scale.

On the proteomics front, tools like AlphaFold, AlphaMissense (which extends AlphaFold to predict the pathogenicity of missense mutations using structural insights), ESMFold, ProtGPT2, and UniProtBERT are pushing the boundaries of protein structure prediction, function annotation and de novo protein design.
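A missense mutation replaces one amino acid in a protein with another. As a rough illustration of the input space a pathogenicity predictor works over, the sketch below enumerates every single-residue substitution of a short sequence. The helper name `single_substitutions` and the toy sequence `MKT` are assumptions made for this example, and no scoring model is included. Not every substitution listed is reachable by a single nucleotide change.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(protein):
    """Yield (label, mutated_sequence) for every single-residue substitution.

    Labels follow the common reference-position-alternate convention with
    1-based positions, e.g. "M1A" means M at position 1 replaced by A.
    """
    for pos, ref in enumerate(protein):
        for alt in AMINO_ACIDS:
            if alt != ref:
                label = f"{ref}{pos + 1}{alt}"
                yield label, protein[:pos] + alt + protein[pos + 1:]


# Toy three-residue sequence, purely illustrative.
variants = dict(single_substitutions("MKT"))
print(len(variants))
```

Each generated variant would then be passed to a trained model for scoring. For a protein of length n this gives 19 × n candidates, which is one reason automated prediction is attractive compared with testing every variant experimentally.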

What’s next for genomic, transcriptomic and proteomic models?

By enabling multi-omics integration, data-driven discovery and generative design, foundation models are transforming the way we approach drug discovery and disease research. Rather than focusing on isolated targets, researchers can now map and manipulate entire biological networks, gaining a more holistic understanding of disease and opening new possibilities for treating complex conditions such as cancer and Alzheimer’s.

As adoption of these tools grows, foundation models will usher in a paradigm shift in the fields of personalized medicine, precision healthcare and complex disease modeling. This AI-powered transformation will not only accelerate timelines and reduce costs but also fundamentally reshape our understanding of human biology, unlocking innovations across biotechnology and biomedical applications.

Expert Contribution

The use of foundation-model AI for biological research, especially across genomics, transcriptomics and proteomics, is strengthened by the contributions of Dawid Rymarczyk, whose deep knowledge of AI-driven biology helps ensure our models remain scientifically robust. Their expertise guides how we apply large-scale biological data and state-of-the-art AI frameworks to enable accurate insights into gene expression, protein structure and multi-omics integration.
