Abstract:
A key task of the human immune system is the recognition and surveillance of peptides presented by the HLA complex on the surface of body cells. In this way, abnormalities can be detected rapidly and targeted immune responses elicited. The identification of the HLA-presented immunopeptidome is thus of tremendous interest for research questions ranging from basic immunological processes to the design of immunotherapies such as vaccinations against infectious diseases and cancer. With advances in biological high-throughput methods such as mass spectrometry, it has become possible to identify thousands of HLA-presented peptide sequences from a single sample of cells or human tissue. This has enabled researchers to directly investigate the peptide sequences presented in the human body and to gain information on their properties. However, the acquisition and evaluation of large amounts of mass spectrometry measurements and HLA peptide sequence characteristics is a highly complex task that requires the development of sophisticated experimental and computational methods. This research work focused on evaluating and improving existing methodology for identifying HLA-bound peptides, and on investigating various aspects of the immunopeptidome presented by human non-malignant and cancer tissues. An essential part of this effort was the development of novel automated digital processing pipelines for HLA immunopeptidomics data. Specifically, two pipelines were developed: “MHCquant”, which achieved superior sensitivity compared with existing software solutions, and “DIAproteomics”, which made it possible to explore the application of the novel method of data-independent acquisition to immunopeptidomics. Applying the “MHCquant” pipeline to the largest currently existing immunopeptidomics data set of human non-malignant tissues made it possible to construct the novel data resource “The HLA Ligand Atlas”.
This benign reference data set is of great significance for comparison with diseased tissues and was thoroughly evaluated for differences across the human population, tissue specificity, and the presence of cryptic peptides from non-canonical genomic origins. Finally, the HLA immunopeptidome of multiple clinical hepatocellular carcinoma samples was analyzed in combination with next-generation genomic sequencing measurements in an in-depth multi-omics approach in order to discover tumor-associated mutated antigens as suitable targets for cancer immunotherapy. While this effort did not identify particular mutated antigens, it was possible to pinpoint tumor somatic mutations that are likely presented as epitopes. The absence of such findings is discussed as a consequence of technological limitations and the low mutational burden of hepatocellular carcinoma. The developed computational workflows and the investigated data sets were made publicly available to serve future generations of the scientific community as a standard against which novel results can be reanalyzed and compared, and to advance the holistic understanding of immunological processes in the human body.