At Ardigen, we decided to take part in this exploration and use the linguistic approach to tackle a question of clinical relevance: optimizing the binding capabilities of peptides. Our model, Protein Refinement by Intelligent Sequence Modification (PRISM), can recognize the binding pockets in a peptide starting from nothing more than the raw amino acid sequence. It can also estimate the “stickiness” of a pocket by measuring its docking energy, and suggest modifications to increase it. PRISM can deliver new insights for predicting protein-protein and protein-peptide interactions. It can also suggest atypical pocket-forming sequences, opening up new avenues in pharmacology research.
Step back and picture the enormous complexity of this problem. Finding the binding properties of such a complex molecule using three-dimensional modeling seems intractable. However, PRISM learns to solve this problem in a surprisingly intuitive fashion. PRISM follows the concept of BERT, where the model internalizes the notion of a sentence, including contextual information. First, PRISM looks at enormous corpora of protein sequences and learns to perform the task of sentence completion. That is, if we present it with a partially covered sequence of amino acids, PRISM can predict the missing residues with high accuracy. Next, PRISM is trained to recognize secondary structures, i.e. local geometry, from raw sequences. Finally, PRISM spends some training cycles looking at binding pockets until it internalizes the concept of “pocketedness”.
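To make the sentence-completion step more concrete, here is a minimal sketch of how a partially covered amino acid sequence could be prepared for this kind of training. It is an illustration only, not PRISM's actual code: the function name `mask_sequence`, the `#` mask symbol, the example sequence, and the 15% masking fraction (the rate used in the original BERT) are all assumptions made for this example.

```python
import random

MASK = "#"  # hypothetical placeholder symbol for a covered residue


def mask_sequence(sequence, mask_fraction=0.15, seed=None):
    """Cover a random subset of residues, BERT-style.

    Returns the masked sequence and a dict mapping each covered
    position to the residue a model would be asked to predict.
    """
    if not sequence:
        return "", {}
    rng = random.Random(seed)
    # Cover at least one residue, but never more than the sequence holds.
    n_masked = min(len(sequence), max(1, round(len(sequence) * mask_fraction)))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    masked = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]  # remember the true residue
        masked[pos] = MASK          # hide it from the model
    return "".join(masked), targets


if __name__ == "__main__":
    # Made-up sequence in one-letter amino acid codes, for illustration only.
    masked, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ", seed=0)
    print(masked)   # the sequence with some residues covered
    print(targets)  # position -> hidden residue
```

During pre-training, a model would receive the masked string and be scored on how well it recovers the residues stored in `targets`. Repeating this over a large corpus is what lets a model of this kind absorb the "grammar" of protein sequences.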
With this knowledge, PRISM can start from raw sequences, identify binding pockets, and explore the space of real proteins (i.e. grammatically correct sequences) in search of more bindable alternatives.
In the linguistic analogy we advocate, drug development consists of writing the right amino acid sentences to confront disease. PRISM acts like a writing assistant, helping biomedical researchers streamline their efforts. We have built this tool with a user-friendly GUI that enables remote access, with the advantages of cloud computing and fast processing. Users just need to enter the raw amino acid sequence and select their preferences. Equipped with PRISM, our customers can start the discovery process with highly optimized candidates. PRISM can also help users improve the catalytic properties of enzymes and introduce substrate selectivity.