Abstract:
Protein-protein interactions (PPIs), which mediate almost all biological processes, show a high level of complexity and heterogeneity. Despite decades of development of experimental and computational techniques, proteome-wide characterization of PPIs remains one of the grand challenges in the biosciences. In this thesis, we addressed the prediction and three-dimensional modelling of PPIs at the proteome level through four distinct projects. The first project is an extensive survey of experimental and computational methods that enable, at least in principle, high-throughput PPI characterization at the proteome level. We emphasized their domains of validity, their limitations, and their potential integration to gain deeper insights into the cellular interactome. In the second project, we addressed the challenges of applying the evolutionary couplings approach, EVcomplex, to proteome-wide PPI prediction and structure elucidation by introducing a fast alignment concatenation method and probabilistic inference. This expanded the tractability of the approach five-fold over the Escherichia coli (E. coli) proteome and extended it to eukaryotic interactions. We then employed EVcomplex 2.0 to explore the PPI network of the E. coli cell envelope proteome, predicting and resolving hundreds of novel and known membrane PPIs that are challenging to characterize experimentally. We proposed de novo structural models of the flagellar hook-filament junction and the Tol/Pal system and demonstrated a successful application to the eukaryotic human spliceosome complex. The third project explored the interactome of the endolysosomal compartment of HEK293 cells using cross-linking mass spectrometry, bioinformatics analysis, and structural modelling. We developed a pipeline for the structural evaluation and validation of the experimentally identified cross-links based on known structures and models.
We supported our results with experimental validation of newly discovered interactions (e.g., ATP6V1D-FZD9, GNB4-FLOT2) and with structural models of interactions of known biological relevance (e.g., the FLOT1-FLOT2 heterodimer, the PPT1 tetramer). Finally, we developed a novel framework, XLEC, that integrates sparse data from evolutionary coupling (EC) and cross-linking mass spectrometry (XL-MS) for efficient interactome profiling and enhanced structural modelling of PPIs. XLEC combines information from both approaches in the restraint-based modelling of complex structures and in a subsequent machine learning-based model for interaction prediction. We applied XLEC to data from murine mitochondrial proteomes and showed improved performance compared with either approach alone. Using XLEC, we generated hundreds of de novo PPI models that reveal novel structural insights into the mammalian mitochondrial interactome. The research presented in this thesis represents a significant stride in the exploration of proteome-wide PPIs. It addressed key challenges in their characterization, offers new insights into the complex landscape of protein interactions, and paves the way for further advances in the field.