AACR POSTER Predicting immunogenic neoepitopes with biology-aware machine learning
Giovanni Mazzocco, Oleksandr Myronov, Piotr Skoczylas, Jan Kaczmarczyk, Iga Niemiec, Katarzyna Gruba, Anna Sanecka-Duin, Piotr Stepniak
Introduction: This study presents a novel method for neoepitope prediction, together with extensive benchmarks against existing solutions and an analysis of which constituent biological features are most predictive. Accurate prediction of neoepitope immunogenicity is an invaluable tool for designing personalized cancer vaccines with effective treatment outcomes. The effectiveness of the host’s adaptive immune response against cancer relies on correct HLA-mediated neoepitope presentation and on recognition by specific CD8+ T-cell clones. Cancer immunotherapies act by boosting the activity of these effector T-cells.
Methods: The proposed AI-driven bioinformatics solution predicts the immunogenicity of HLA class I-restricted neoepitopes by combining several analytical modules that simulate the biological processes leading to CD8+ T-cell activation. These modules can be flexibly composed depending on the data available and include: (i) selection of potential neoepitopes based on cancer NGS data, (ii) expression of neoepitope-associated genes, (iii) similarity to self, (iv) prediction of neoepitope-HLA binding affinity and stability, (v) effect of post-translational modifications on neoepitope presentation and TCR recognition, (vi) prediction of neoepitope:TCR recognition probability based on features derived from neoepitope-HLA:TCR structures and on TCR CDR3 sequence similarities. The model was trained on a curated dataset combining records from multiple studies of neoepitopes experimentally validated for their ability to elicit an adaptive immune response.
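The idea of composing modules according to the data available can be illustrated with a minimal sketch. Everything here is hypothetical and is not the authors' implementation: the function names, the sample keys (`gene_tpm`, `peptide`, `self_peptides`), and the toy similarity-to-self score (highest fraction of identical positions against same-length self peptides) are assumptions chosen only to show how a module can be skipped when its input is missing.

```python
from typing import Callable, Dict, Optional

Sample = Dict[str, object]
Module = Callable[[Sample], Optional[float]]


def expression_feature(sample: Sample) -> Optional[float]:
    """Hypothetical module: expression of the source gene (e.g. in TPM)."""
    tpm = sample.get("gene_tpm")
    return None if tpm is None else float(tpm)


def self_similarity_feature(sample: Sample) -> Optional[float]:
    """Hypothetical module: toy similarity-to-self score in [0, 1]."""
    peptide = sample.get("peptide")
    self_peptides = sample.get("self_peptides")
    if not peptide or not self_peptides:
        return None  # required input missing, so the module is skipped
    same_length = [p for p in self_peptides if len(p) == len(peptide)]
    if not same_length:
        return None
    # Fraction of identical positions against the closest self peptide.
    return max(
        sum(a == b for a, b in zip(peptide, p)) / len(peptide)
        for p in same_length
    )


# Registry of available modules (illustrative names only).
MODULES: Dict[str, Module] = {
    "expression": expression_feature,
    "self_similarity": self_similarity_feature,
}


def compose_features(sample: Sample,
                     modules: Dict[str, Module] = MODULES) -> Dict[str, float]:
    """Run every module and keep only those that could produce a value."""
    features = {}
    for name, module in modules.items():
        value = module(sample)
        if value is not None:
            features[name] = value
    return features


# No expression data in this sample, so only one feature is produced.
sample = {"peptide": "SIINFEKL", "self_peptides": ["SIINFEKV", "AAAAAAAA"]}
print(compose_features(sample))
```

In a design like this, a downstream model would need to tolerate missing features, for example by training one model per feature subset or by using a learner that handles missing values; the abstract does not state which approach is used.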
Results: We present a benchmarking study comparing our solution with widely used methods for assessing neoepitope immunogenicity. We assess the predictive power of each analytical module to quantify its individual information gain and identify the most informative modules. All methods included in the benchmarking study were tested on the dataset provided by Chowell et al. 2015. Performance is evaluated using ROC AUC and the overall percentage of correctly detected immunogenic neoepitopes (precision).
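For readers unfamiliar with the two metrics, the following is a minimal sketch of how they can be computed from binary labels and predicted scores. It assumes precision in its standard sense (the fraction of peptides called immunogenic that truly are) and an illustrative decision threshold; the labels and scores are made-up toy values, not data from the benchmark.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a
    random negative (ties count as half), i.e. the Mann-Whitney form."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision(labels: Sequence[int], scores: Sequence[float],
              threshold: float) -> float:
    """Fraction of peptides scored at or above the threshold that are
    truly immunogenic (label 1)."""
    called = [y for y, s in zip(labels, scores) if s >= threshold]
    if not called:
        return 0.0  # convention chosen here when nothing is called positive
    return sum(called) / len(called)


# Toy example: 1 = immunogenic, 0 = non-immunogenic.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))
print(precision(labels, scores, threshold=0.5))
```

ROC AUC is threshold-free and summarizes ranking quality, whereas precision depends on the chosen cutoff, so the two metrics complement each other when comparing methods.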
Conclusion: The results show the utility of the compared methods for personalized cancer vaccine design and highlight the importance of modules that model the biological processes underlying neoepitope presentation and immunogenicity.