Clinical trials are the gold standard for assessing the safety and efficacy of new therapies. Still, underpowered studies, strict eligibility criteria, and unpredictable treatment responses can obscure meaningful results and even derail promising drugs. Artificial intelligence can help match the right patients to the right trials and ultimately improve the chances of bringing effective treatments to patients.
Challenges in Clinical Trial Design Due to Patient Response Variability
One of the major concerns in clinical trials is the heterogeneous response of patients to treatment. Even within the same disease, patients may differ in genetic profile, biomarkers, comorbidities, or lifestyles.
Conditions like pulmonary arterial hypertension (PAH) exhibit significant patient heterogeneity, leading to varied eligibility criteria across studies. This means that trials designed for the “average” patient with PAH often fail to capture the actual effects of therapies in general patient populations. As a result, drug efficacy may be either underestimated or overestimated, and potential benefits for specific subgroups of patients remain undiscovered [1].
Trials designed this way are unlikely to yield significant results and therefore contribute little to the advancement of knowledge. They also pose an unjustified risk to patients and waste the sponsors’ resources.
This is where machine learning helps. ML algorithms can analyze high-dimensional datasets, from genomics and transcriptomics to imaging and clinical variables, identifying patterns that are difficult for humans to spot. This enables:
- Identifying clusters of patients who respond similarly to a given therapy,
- Predicting treatment response based on individual patient characteristics,
- Extracting biomarkers that allow for earlier and more accurate prediction of therapeutic outcomes,
- Optimizing trial eligibility criteria so that study populations better reflect real-world patient diversity.
ML can be applied to patient heterogeneity at several stages of clinical trials:
- Trial design (preclinical and early clinical phases): analyzing existing data to design more representative inclusion and exclusion criteria.
- Recruitment (Phase II–III): speeding up the identification of potential participants who meet clinical trial eligibility criteria.
- Data analysis (Phase III–IV): identifying subpopulations that benefit most or are prone to adverse events.
This approach increases the likelihood of clinical trial success and accelerates the development of precision medicine, where treatments are tailored to the individual rather than the statistical average.
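As a minimal sketch of the trial design step above, the snippet below compares how many patients of an existing cohort remain eligible under strict versus broadened criteria. The cohort, the field names (`age`, `ecog`), and the thresholds are all made up for illustration; a real analysis would use the trial's actual criteria and historical patient data.

```python
# Toy cohort: every record and threshold here is hypothetical.
patients = [
    {"id": "P001", "age": 54, "ecog": 0},
    {"id": "P002", "age": 81, "ecog": 1},
    {"id": "P003", "age": 47, "ecog": 2},
    {"id": "P004", "age": 63, "ecog": 1},
    {"id": "P005", "age": 39, "ecog": 0},
    {"id": "P006", "age": 72, "ecog": 2},
]


def eligible_ids(cohort, max_age, max_ecog):
    """Return IDs of patients who satisfy both inclusion thresholds."""
    return [
        p["id"]
        for p in cohort
        if p["age"] <= max_age and p["ecog"] <= max_ecog
    ]


# Strict criteria keep a narrow population; broadened criteria keep more.
strict = eligible_ids(patients, max_age=70, max_ecog=1)
broad = eligible_ids(patients, max_age=85, max_ecog=2)

print(f"strict: {len(strict)}/{len(patients)} eligible -> {strict}")
print(f"broad:  {len(broad)}/{len(patients)} eligible -> {broad}")
```

Running the same comparison on real historical data is one simple way to see how much of the real-world population a given set of criteria excludes before the protocol is finalized.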
Machine Learning Models Can Accurately Predict Treatment Responders in Various Medical Contexts
ML models often outperform traditional statistical approaches and clinical guidelines in identifying responders and non-responders.
Models using gene expression or multi-omics profiles (e.g., RNA-seq, mutation data) can predict drug response in cancer with high accuracy, exceeding 80% in some studies. Support vector machines, random forests, and deep learning architectures are commonly used, with feature selection and data integration improving performance. Pretraining on large, unlabeled datasets and incorporating drug chemical features further enhances predictive power [2-6].
Imaging features (e.g., radiomics) are frequently combined with clinical or omics data in ML models to improve response prediction, especially in oncology [2, 4].
ML models trained on clinical variables (e.g., demographics, lab values, disease characteristics) have been shown to predict response to therapies such as cardiac resynchronization therapy (CRT) and neoadjuvant chemotherapy in breast cancer. Models like logistic regression, random forest, and gradient boosting achieve AUCs of 0.77–0.81, outperforming guideline-based approaches [7, 12].
| Application Area | Data Types Used | Best Model(s) | Performance (AUC/Accuracy)* | Key Predictors/Features | Citations |
|---|---|---|---|---|---|
| Cancer drug response | Omics (gene expression) | SVM, RF, DNN | >80% accuracy | Gene signatures, molecular fingerprints | [2-6] |
| CRT response | Clinical | LR, EN, Ridge, NB | 0.70–0.77 (AUC) | QRS morphology, LBBB, LVESD, PCI | [7-9] |
| Breast cancer NAC | Clinical Pathology | LightGBM, RF | 0.79–0.81 (AUC) | ER/PR/HER2 status, age, tumor size | [10-12] |
| Immunotherapy response | Transcriptomics | LR, SVC, RF | Superior to conventional | Network-based pathway biomarkers | [6] |
Table 1. Comparison of ML models, data types, and performance in predicting responders.
*In this context, performance is often reported as the area under the ROC curve (AUC), which reflects how well a model can distinguish responders from non-responders (where 0.5 indicates random guessing and 1.0 indicates perfect classification).
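To make the AUC definition concrete, here is a small sketch that computes it directly from its pairwise interpretation: the fraction of (responder, non-responder) pairs in which the responder receives the higher score, with ties counted as half. The labels and scores are invented for illustration and do not come from any of the cited studies.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison (1 = responder, 0 = non-responder)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # responder ranked above non-responder
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs for six patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 8 of 9 pairs are ordered correctly -> 0.889
```

This quadratic-time version is only meant to show what the number means; libraries typically compute the same quantity from sorted scores.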
Integrating Multiple Data Modalities Significantly Enhances the Development of Robust Response Signatures
Relying on just one type of patient data frequently gives an incomplete picture of how someone will respond to treatment. That’s why researchers increasingly use multimodal (or multi-omics) approaches, which combine different kinds of data such as genomics, transcriptomics, and immune profiling.
By examining these data sources together, machine learning models can identify biomarker combinations that couldn’t be found in a single dataset. For instance, a combination of gene mutations, gene expression levels, and immune cell profiles can reveal why some patients respond well to a therapy while others don’t [13-15].
This data integration has a clear impact: studies show that adding genetic and transcriptomic data to traditional clinical information can substantially improve prediction accuracy. In breast cancer, for example, this approach raised the accuracy of drug-response prediction to 0.978, compared with 0.51–0.82 for clinical data alone [13-16].
In practice, this means clinical trials could become more efficient by identifying likely responders earlier, reducing trial size or duration, and focusing resources on the patients who are most likely to benefit.
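One common way to combine modalities is feature-level ("early") fusion: each modality is scaled separately and the features are concatenated into one vector per patient, which a downstream model can then learn from. The sketch below shows only that step, with invented values and hypothetical feature meanings; it is not the method of any cited study, which use more elaborate integration schemes.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature (column) to zero mean and unit variance."""
    scaled_cols = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        scaled_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(r) for r in zip(*scaled_cols)]


def early_fusion(*modalities):
    """Scale each modality on its own, then concatenate per patient."""
    scaled = [zscore_columns(m) for m in modalities]
    return [sum(parts, []) for parts in zip(*scaled)]


# Hypothetical data for three patients (rows are aligned across modalities).
clinical = [[54.0, 2.1], [61.0, 3.4], [47.0, 1.8]]          # e.g. age, tumor size
expression = [[5.2, 0.3, 7.1], [4.8, 1.9, 6.5], [6.1, 0.7, 7.9]]  # e.g. 3 genes

fused = early_fusion(clinical, expression)
print(len(fused), "patients x", len(fused[0]), "fused features")
```

In a real pipeline the scaling parameters would be estimated on the training set only and reused for validation and test patients, to avoid leaking information across splits.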
How Can AI Identify Responders to Immunotherapies and Targeted Agents?
In oncological drug development, predicting response to biological treatments is a pressing problem. For a client in an oncology setting, we developed a solution that differentiates responders from non-responders and achieves a 10% improvement in ROC AUC over the baseline model. The gain came from a multimodal AI approach integrating tissue morphology graphs, histology data, and clinical information.
Fig. 1. A 10% rise in ROC AUC for responder/non-responder prediction with a model trained on multimodal data (histology, tissue morphology graphs, and clinical data). For more info, visit the case study site.
Ensemble deep learning models, such as the ELISE pipeline, are another effective strategy. An ensemble combines the predictions of multiple models rather than relying on a single one. The idea is that different models may capture different patterns in the data. By pooling them, you reduce the risk of error from any single model and usually get better overall performance.
They have demonstrated high accuracy in predicting immunotherapy response using genetic and transcriptomic features. Reported results include:
- AUCs up to 1.00 (outstanding discrimination) for atezolizumab in esophageal adenocarcinoma and metastatic melanoma,
- and 0.89 (excellent discrimination) for PD-1/PD-L1 inhibitors in metastatic urothelial cancer [17].
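The pooling idea behind ensembles can be sketched with simple "soft voting", where the predicted response probabilities of several models are averaged per patient. This is a generic illustration with invented probabilities, not the architecture of the ELISE pipeline; the model names and the 0.5 decision threshold are assumptions.

```python
def soft_vote(probabilities_per_model, weights=None):
    """Weighted average of per-patient probabilities across models."""
    if weights is None:
        weights = [1.0] * len(probabilities_per_model)
    total = sum(weights)
    n_patients = len(probabilities_per_model[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, probabilities_per_model))
        / total
        for i in range(n_patients)
    ]


# Hypothetical response probabilities from three models for three patients.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.4]
model_c = [0.8, 0.3, 0.6]

ensemble = soft_vote([model_a, model_b, model_c])
predicted_responder = [p >= 0.5 for p in ensemble]  # assumed threshold

print([round(p, 3) for p in ensemble])
print(predicted_responder)
```

For the third patient the models disagree (one score falls below 0.5), and the average settles the call; that smoothing of individual model errors is the benefit the ensemble is meant to provide.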
State-of-the-art (SOTA) computer vision techniques extend these possibilities further. These methods excel at analyzing images: they can precisely segment cancer lesions on CT scans or histology slides, extract quantitative features, and link them to the treatment response.
Ardigen’s experts developed a SOTA model for advanced segmentation of lesions in lung cancer computed tomography (CT) scans. This approach can also be extended to identify the best responders.
Similarly, Trebeschi and collaborators demonstrated how AI-driven analysis of CT scans can noninvasively predict immunotherapy response, achieving lesion-level AUCs of up to 0.83 in non-small cell lung carcinoma and patient-level AUCs of up to 0.76, linked to proliferative pathway activity [18].
The examples above show that the methods may vary, but whether through integrating multi-omics data, building ensemble models, or applying computer vision, the goal remains the same: to make it easier to find the patients who will respond best to treatment.
A Promise That Raises Ethical Implications
Artificial intelligence tools for patient selection in healthcare offer improved outcomes but introduce complex regulatory and ethical concerns. Among them are data privacy, algorithmic bias, transparency, accountability, informed consent, and the impact on patient autonomy and trust.
AI can perpetuate or amplify biases present in training data, leading to unfair or discriminatory patient selection and exacerbating health disparities. Regular audits, diverse datasets, and bias mitigation strategies are needed to ensure equity.
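A basic audit of the kind mentioned above can start by comparing a model's sensitivity (the share of true responders it correctly identifies) across patient subgroups. The sketch below uses invented records and two placeholder groups, "A" and "B"; which groups and which metrics matter in practice depends on the trial and is an assumption here.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """records: (group, true_label, predicted_label), 1 = responder."""
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_pos[group] += 1
    return {g: true_pos[g] / positives[g] for g in positives}


# Hypothetical predictions for two subgroups.
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 0, 1),
]

rates = sensitivity_by_group(records)
gap = max(rates.values()) - min(rates.values())

for group, rate in sorted(rates.items()):
    print(f"group {group}: sensitivity = {rate:.2f}")
print(f"largest gap between groups = {gap:.2f}")
```

A large gap would suggest that true responders in one subgroup are being missed more often, which is a signal to revisit the training data or the model before using it for patient selection.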
Many AI models operate as “black boxes,” making it difficult to understand or challenge their reasoning. Explainable AI (XAI) and clear communication are critical for trust and informed decision-making. Unclear legal regulations also complicate the attribution of responsibility for AI-driven choices, especially in cases of harm. Current laws lag behind technological advances, and there is a great need for strict guidelines on liability [19].
Regulatory bodies are independently working on developing adaptive frameworks, but we need global harmonization and continuous oversight.
Leverage Patient Stratification in Your Clinical Trials
By predicting likely responders, advanced AI models enable more reliable, ethical, and efficient trials. At Ardigen, we combine state-of-the-art and explainable AI solutions with deep biomedical expertise to help sponsors and CROs optimize patient selection and trial design.
If you want to learn how our AI-driven platforms can accelerate your research, contact our team.
Author: Martyna Piotrowska
Technical editing (Ardigen expert): Anna Sanecka-Duin, PhD
Bibliography:
- Fogel, DB. Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Contemporary Clinical Trials Communications. 2018 Aug 7. https://doi.org/10.1016/j.conctc.2018.08.001
- Adam, G., Rampášek, L., Safikhani, Z., Smirnov, P., Haibe-Kains, B., & Goldenberg, A. Machine learning approaches to drug response prediction: challenges and recent progress. NPJ Precision Oncology. 2020; 4. https://doi.org/10.1038/s41698-020-0122-1
- Machine learning predicts individual cancer patient responses to therapeutic drugs with high accuracy. Scientific Reports. 2018; 8. https://doi.org/10.1038/s41598-018-34753-5
- Rafique, R., Islam, S., & Kazi, J. Machine learning in the prediction of cancer therapy. Computational and Structural Biotechnology Journal. 2021; 19. https://doi.org/10.1016/j.csbj.2021.07.003
- Baptista, D., Ferreira, P., & Rocha, M. Deep learning for drug response prediction in cancer. Briefings in Bioinformatics. 2020; 22(1). https://doi.org/10.1093/bib/bbz171
- Kong, J., Ha, D., Lee, J., Kim, I., Park, M., Im, S., Shin, K., & Kim, S. Network-based machine learning approach to predict immunotherapy response in cancer patients. Nature Communications. 2022; 13. https://doi.org/10.1038/s41467-022-31535-6
- Feeny, A., Rickard, J., Patel, D., Toro, S., Trulock, K., Park, C., Labarbera, M., Varma, N., Niebauer, M., Sinha, S., Gorodeski, E., Grimm, R., Ji, X., Barnard, J., Madabhushi, A., Spragg, D., & Chung, M. Machine Learning Prediction of Response to Cardiac Resynchronization Therapy. Circulation: Arrhythmia and Electrophysiology. 2019. https://doi.org/10.1161/CIRCEP.119.007316
- Liang, Y., Ding, R., Wang, J., Gong, X., Yu, Z., Pan, L., Huang, J., Li, R., Su, Y., Zhu, S., & Ge, J. Prediction of response after cardiac resynchronization therapy with machine learning. International Journal of Cardiology. 2021. https://doi.org/10.1016/j.ijcard.2021.09.049
- Liang, Y., Ding, R., Zhu, S., Su, Y., & Ge, J. Development of machine learning models to predict response after cardiac resynchronization therapy. European Heart Journal. 2020. https://doi.org/10.1093/ehjci/ehaa946.0797
- Kim, J., Jeon, E., Kwon, S., Jung, H., Joo, S., Park, Y., Lee, S., Lee, J., Nam, S., Cho, E., Park, Y., Ahn, J., & Im, Y. Prediction of pathologic complete response to neoadjuvant chemotherapy using machine learning models in patients with breast cancer. Breast Cancer Research and Treatment. 2021; 189. https://doi.org/10.1007/s10549-021-06310-8
- Rahadian, R., Tan, H., Ho, B., Kumaran, A., Villanueva, A., Sng, J., Tan, R., Tan, T., Tan, V., Tan, B., Lim, G., Cai, Y., Nei, W., & Wong, F. Using Machine Learning Models to Predict Pathologic Complete Response to Neoadjuvant Chemotherapy in Breast Cancer. JCO Clinical Cancer Informatics. 2024; 8. https://doi.org/10.1200/CCI.24.00071
- Zhao, F., Polley, E., McClellan, J., Howard, F., Olopade, O., & Huo, D. Predicting pathologic complete response to neoadjuvant chemotherapy in breast cancer using a machine learning approach. Breast Cancer Research: BCR. 2024; 26. https://doi.org/10.1186/s13058-024-01905-7
- Rashid, M., & Selvarajoo, K. Advancing drug-response prediction using multi-modal and -omics machine learning integration (MOMLIN): a case study on breast cancer clinical data. Briefings in Bioinformatics. 2024; 25. https://doi.org/10.1093/bib/bbae300
- Cao, Z., & Gao, G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nature Biotechnology. 2022; 40. https://doi.org/10.1038/s41587-022-01284-4
- Argelaguet, R., Arnol, D., Bredikhin, D., Deloro, Y., Velten, B., Marioni, J., & Stegle, O. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. Genome Biology. 2020; 21. https://doi.org/10.1186/s13059-020-02015-1
- Paltun, B., Mamitsuka, H., & Kaski, S. Improving drug response prediction by integrating multiple data sources: matrix factorization, kernel and network-based approaches. Briefings in Bioinformatics. 2019; 22. https://doi.org/10.1093/bib/bbz153
- Jin, W., Yang, Q., Chi, H., Wei, K., Zhang, P., Zhao, G., Chen, S., Xia, Z., & Li, X. Ensemble deep learning enhanced with self-attention for predicting immunotherapeutic responses to cancers. Frontiers in Immunology. 2022; 13. https://doi.org/10.3389/fimmu.2022.1025330
- Trebeschi, S., Drago, S., Birkbak, N., Kurilova, I., Calin, A., Pizzi, A., Lalezari, F., Lambregts, D., Rohaan, M., Parmar, C., Rozeman, E., Hartemink, K., Swanton, C., Haanen, J., Blank, C., Smit, E., Beets-Tan, R., & Aerts, H. Predicting response to cancer immunotherapy using noninvasive radiomic biomarkers. Annals of Oncology. 2019; 30. https://doi.org/10.1093/annonc/mdz108
- Zhang, J., & Zhang, Z. Ethics and governance of trustworthy medical artificial intelligence. BMC Medical Informatics and Decision Making. 2023; 23. https://doi.org/10.1186/s12911-023-02103-9
- Mennella, C., Maniscalco, U., De Pietro, G., & Esposito, M. Ethical and regulatory challenges of AI technologies in healthcare: A narrative review. Heliyon. 2024; 10. https://doi.org/10.1016/j.heliyon.2024.e26297
- Jeyaraman, M., Balaji, S., Jeyaraman, N., & Yadav, S. Unraveling the Ethical Enigma: Artificial Intelligence in Healthcare. Cureus. 2023; 15. https://doi.org/10.7759/cureus.43262
- Elendu, C., Amaechi, D., Elendu, T., Jingwa, K., Okoye, O., Okah, M., Ladele, J., Farah, A., & Alimi, H. Ethical implications of AI and robotics in healthcare: A review. Medicine. 2023; 102. https://doi.org/10.1097/MD.0000000000036671
- Hanna, M., Pantanowitz, L., Jackson, B., Palmer, O., Visweswaran, S., Pantanowitz, J., Deebajah, M., & Rashidi, H. Ethical and Bias Considerations in Artificial Intelligence (AI)/Machine Learning. Modern Pathology. 2024. https://doi.org/10.1016/j.modpat.2024.100686
- Goktas, P., & Grzybowski, A. Shaping the Future of Healthcare: Ethical Clinical Challenges and Pathways to Trustworthy AI. Journal of Clinical Medicine. 2025; 14. https://doi.org/10.3390/jcm14051605
- Weiner, E., Dankwa-Mullan, I., Nelson, W., & Hassanpour, S. Ethical challenges and evolving strategies in the integration of artificial intelligence into clinical practice. PLOS Digital Health. 2024; 4. https://doi.org/10.1371/journal.pdig.0000810
- Čartolovni, A., Tomičić, A., & Mosler, E. Ethical, legal, and social considerations of AI-based medical decision-support tools: A scoping review. International Journal of Medical Informatics. 2022; 161. https://doi.org/10.1016/j.ijmedinf.2022.104738
- Afnan, M., Liu, Y., Conitzer, V., Rudin, C., Mishra, A., & Savulescu, J. O-098 Embryo selection using Artificial Intelligence (AI): Epistemic and ethical considerations. Human Reproduction. 2021. https://doi.org/10.1093/humrep/deab125.034
- Cestonaro, C., Delicati, A., Marcante, B., Caenazzo, L., & Tozzo, P. Defining medical liability when artificial intelligence is applied on diagnostic algorithms: a systematic review. Frontiers in Medicine. 2023; 10. https://doi.org/10.3389/fmed.2023.1305756
- Naik, N., Hameed, B., Shetty, D., Swain, D., Shah, M., Paul, R., Aggarwal, K., Ibrahim, S., Patil, V., Smriti, K., Shetty, S., Rai, B., Chłosta, P., & Somani, B. Legal and Ethical Consideration in Artificial Intelligence in Healthcare: Who Takes Responsibility? Frontiers in Surgery. 2022; 9. https://doi.org/10.3389/fsurg.2022.862322
- Benzinger, L., Ursin, F., Balke, W., Kacprowski, T., & Salloch, S. Should Artificial Intelligence be used to support clinical ethical decision-making? A systematic review of reasons. BMC Medical Ethics. 2023; 24. https://doi.org/10.1186/s12910-023-00929-6