Collaboration with Jagiellonian University
About the poster
This poster, prepared for OBayConnect 2024, presents a comparative analysis of multiple AI strategies designed to integrate chemical structure data with high-content phenotypic profiles for compound similarity matching. The goal of the project is to improve our ability to identify and group compounds based on shared biological activity by leveraging complementary data modalities.
The study evaluates several models, including SimSiam, SPOCO, and anchoring methods that combine ECFP or R-MAT with Cell Painting (CP) features, using recall@10, recall@100, and median ranking position. These metrics measure how effectively each model retrieves compounds with similar biological effects from large datasets. The results highlight the trade-offs between data types and architectures, and show the potential of AI-driven fusion models to accelerate drug discovery through more biologically meaningful compound matching.
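As an illustration of how such retrieval metrics are commonly computed, the sketch below scores a set of query embeddings against a gallery of candidate embeddings. It is not the poster's evaluation code: the function name `retrieval_metrics`, the use of cosine similarity, and the assumption that each query has exactly one correct match in the gallery are all choices made for this example.

```python
import numpy as np

def retrieval_metrics(query_emb, gallery_emb, true_idx, ks=(10, 100)):
    """Recall@k and median rank for a one-match-per-query retrieval task.

    Hypothetical helper: assumes cosine similarity and a single true
    gallery index per query, which the poster does not specify.
    """
    query_emb = np.asarray(query_emb, dtype=float)
    gallery_emb = np.asarray(gallery_emb, dtype=float)
    true_idx = np.asarray(true_idx)

    # L2-normalise so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    sim = q @ g.T  # shape: (n_queries, n_gallery)

    # Similarity of each query to its true match.
    true_sim = sim[np.arange(len(q)), true_idx]

    # Rank = 1 + number of gallery items scoring strictly higher.
    ranks = 1 + (sim > true_sim[:, None]).sum(axis=1)

    metrics = {f"recall@{k}": float((ranks <= k).mean()) for k in ks}
    metrics["median_rank"] = float(np.median(ranks))
    return metrics

if __name__ == "__main__":
    # Synthetic data only; real inputs would be model embeddings.
    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(1000, 64))
    queries = gallery[:200] + 0.5 * rng.normal(size=(200, 64))
    print(retrieval_metrics(queries, gallery, np.arange(200)))
```

Under these assumptions, a higher recall@k and a lower median rank both mean the true match tends to appear nearer the top of the retrieved list.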
This work underscores the importance of robust benchmarking in AI-for-drug-discovery applications and provides practical insights into how combining structure-based and image-based data can refine virtual screening and repurposing pipelines.