Learning to design protein-protein interactions with enhanced generalization

Bushuiev, Anton; Bushuiev, Roman; Kouba, Petr; Filkin, Anatolii; Gabrielova, Marketa; Gabriel, Michal; Sedlar, Jiri; Pluskal, Tomas; Damborský, Jiří; Mazurenko, Stanislav; Sivic, Josef

Publication details

Authors	BUSHUIEV Anton; BUSHUIEV Roman; KOUBA Petr; FILKIN Anatolii; GABRIELOVA Marketa; GABRIEL Michal; SEDLAR Jiri; PLUSKAL Tomas; DAMBORSKÝ Jiří; MAZURENKO Stanislav; SIVIC Josef
Year of publication	2024
Type	Article in Proceedings
Conference	12th International Conference on Learning Representations 2024
MU Faculty or unit	Faculty of Science
Citation
web	https://openreview.net/forum?id=xcMmebCT7s
Keywords	protein-protein interactions; protein design; generalization; self-supervised learning; equivariant 3D representations
Description	Discovering mutations enhancing protein-protein interactions (PPIs) is critical for advancing biomedical research and developing improved therapeutics. While machine learning approaches have substantially advanced the field, they often struggle to generalize beyond training data in practical scenarios. The contributions of this work are three-fold. First, we construct PPIRef, the largest and non-redundant dataset of 3D protein-protein interactions, enabling effective large-scale learning. Second, we leverage the PPIRef dataset to pre-train PPIformer, a new SE(3)-equivariant model generalizing across diverse protein-binder variants. We fine-tune PPIformer to predict effects of mutations on protein-protein interactions via a thermodynamically motivated adjustment of the pre-training loss function. Finally, we demonstrate the enhanced generalization of our new PPIformer approach by outperforming other state-of-the-art methods on new, non-leaking splits of standard labeled PPI mutational data and independent case studies optimizing a human antibody against SARS-CoV-2 and increasing the thrombolytic activity of staphylokinase.
Related projects:	CETOCOEN Excellence; CETOCOEN Excellence; ELIXIR-CZ: Czech National Infrastructure for Biological Data; RECETOX research infrastructure
