Good AI practice is no longer optional in drug discovery: EMA and FDA set the direction
On January 14, 2025, the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA) jointly published Guiding Principles for Good AI Practice in Drug Development. Although the document is not legally binding, it provides a clear and shared regulatory perspective on how artificial intelligence is expected to be developed, validated, and maintained in drug discovery and biomedical R&D.
The guidance outlines ten principles that together define what regulators consider good practice when using AI in regulated and near-regulated environments. Importantly, the focus is not limited to model performance. Instead, EMA and FDA place strong emphasis on data quality, traceability, interpretability, and lifecycle oversight – areas that often determine whether AI systems can be trusted in high-stakes scientific contexts.
One of the most concrete expectations described in the guidance is end-to-end traceability. Drug developers are encouraged to maintain detailed, auditable records of data sources, data processing steps, and model development decisions. In practice, this means being able to reconstruct how a result was produced, including where the data came from, how it was transformed, and how the model was trained. This level of transparency is essential for alignment with Good Practice (GxP) requirements and for enabling validation, audits, and long-term system maintenance.
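As a rough illustration of what such auditable records could look like in practice, the sketch below captures provenance metadata for a single pipeline step. It is a hypothetical example, not part of the guidance: the step name, data path, and parameters are invented, and real systems would typically use a dedicated provenance or lineage framework rather than hand-rolled records.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    """One auditable step in a data pipeline: what came in, what was done, what went out."""
    step: str            # name of the processing step, e.g. "filter_low_expression"
    input_source: str    # where the data came from
    parameters: dict     # how the data was transformed
    output_digest: str   # fingerprint of the result, for later verification
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def digest(data: bytes) -> str:
    """Content fingerprint so a result can later be re-checked byte for byte."""
    return hashlib.sha256(data).hexdigest()

# Record a (toy) transformation step so the result can later be reconstructed.
raw_csv = b"gene_a,1.2\ngene_b,0.7\n"
filtered_csv = b"gene_a,1.2\n"
record = ProvenanceRecord(
    step="filter_low_expression",
    input_source="s3://example-bucket/raw/expression.csv",  # hypothetical path
    parameters={"min_value": 1.0},
    output_digest=digest(filtered_csv),
)

# An append-only, machine-readable trail supports the audits the guidance describes.
audit_log = [json.dumps(asdict(record))]
```

Chaining such records across every sourcing, transformation, and training step is one way to make a result reconstructable end to end, which is the core of the traceability expectation.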
The agencies also stress the importance of clarity of purpose. AI systems should have a clearly defined role and scope within the discovery workflow, and their outputs should be understandable and relevant to the intended user. This reflects a broader regulatory shift away from opaque, general-purpose models toward AI systems that support specific scientific questions and decision points, such as target prioritization, biomarker discovery, or patient stratification.
Another key aspect of the guidance is risk-based validation across the full system lifecycle. EMA and FDA recommend that performance assessments account not only for model behavior, but also for human–AI interaction and the context in which the system is used. Validation is treated as an ongoing responsibility rather than a one-time milestone, with regular monitoring and reassessment expected as data, models, and use cases evolve.
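One simple way to treat validation as ongoing rather than one-time is a scheduled check that compares current model performance against the baseline established at initial validation, and flags the system for reassessment when performance degrades. The sketch below is illustrative only; the metric, threshold, and numbers are assumptions, not values from the guidance.

```python
from dataclasses import dataclass

@dataclass
class ValidationBaseline:
    metric_name: str       # e.g. AUROC on a held-out reference set
    baseline_value: float  # performance recorded at initial validation
    tolerance: float       # acceptable degradation before human review

def needs_reassessment(current_value: float, baseline: ValidationBaseline) -> bool:
    """Flag the model for review when performance drops beyond the tolerance."""
    return (baseline.baseline_value - current_value) > baseline.tolerance

# Toy monitoring check against a hypothetical validation baseline.
baseline = ValidationBaseline(metric_name="AUROC", baseline_value=0.88, tolerance=0.03)
ok_drift = needs_reassessment(0.86, baseline)   # small drop, within tolerance
bad_drift = needs_reassessment(0.82, baseline)  # larger drop, triggers review
```

In a real deployment this check would run on fresh, representative data at a defined cadence, and a triggered flag would start a documented reassessment rather than an automatic model change.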
Taken together, these principles signal a broader change in how AI is viewed in drug discovery. The discussion is moving beyond proof-of-concept models and experimental pilots toward AI systems that are reproducible, interpretable, and operationally robust. AI is no longer positioned as an experimental add-on, but as an integral component of discovery pipelines that must withstand scientific and regulatory scrutiny.
For organizations developing AI in early-stage drug discovery, this has direct practical implications. Many of the risks highlighted by regulators originate upstream, long before any model is trained. Decisions related to data sourcing, governance, integration, and reuse strongly influence whether AI outputs can later be trusted, explained, and defended. Weak data foundations are difficult to correct downstream, regardless of model sophistication.
This regulatory direction closely aligns with how Ardigen approaches AI in drug discovery. Our work focuses on building biologically grounded, traceable data pipelines and reusable data products that support interpretation and long-term use. Rather than treating AI models as standalone assets, we design them as part of an end-to-end data journey, with explicit links between data inputs, analytical steps, and scientific conclusions.
The EMA–FDA guidance also reinforces the importance of interpretability and context, particularly as discovery increasingly relies on multimodal data. Integrating omics, imaging, phenotypic, and screening data introduces additional complexity, making provenance and clarity even more critical. Ardigen’s PhenAID platform was developed to address these challenges by enabling multimodal integration while preserving traceability and biological meaning across data types.
Finally, the guidance implicitly recognizes that building trustworthy AI is not a one-off technical exercise. It requires continuous oversight, adaptation, and collaboration as scientific questions, datasets, and regulatory expectations evolve. This is why Ardigen positions itself as a long-term partner rather than a point-solution vendor, supporting pharmaceutical and biotechnology teams across the full data and AI lifecycle.
The joint EMA–FDA principles do not introduce a sudden change. They confirm a trajectory the industry is already on, where trust, transparency, and robustness define the future of AI in drug discovery. For organizations investing in AI today, aligning with these principles is no longer optional – it is becoming a prerequisite for sustainable impact.