Efficient Workflows for Analyzing High-Performance Liquid Chromatography Mass Spectrometry-Based Proteomics Data

URI: http://hdl.handle.net/10900/91508
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-915085
http://dx.doi.org/10.15496/publikation-32889
Document type: PhD Thesis
Date: 2019-08-15
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Kohlbacher, Oliver (Prof. Dr.)
Day of Oral Examination: 2019-07-11
DDC Classification: 004 - Data processing and computer science
570 - Life sciences; biology
Keywords: Proteome analysis, Software, Algorithm
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

Modern high-throughput technologies in proteomics and related fields produce ever-growing amounts of complex experimental data. The many techniques for quantification and identification of peptides and proteins, together with the wealth of available instrument types, give rise to a wide range of computational challenges. As a consequence, computational data analysis has become a crucial bottleneck of the overall workflow in today’s proteomics studies. In this thesis, we present novel algorithms and tools for efficient automated data analysis of high-throughput LC-MS proteomics data, evaluate their performance in various benchmark settings, and demonstrate the successful application of our methods in a proteomics study in the field of forensic science. All tools developed in the context of this thesis are implemented in OpenMS, an open-source framework for computational mass spectrometry. We introduce TOPPAS, a dedicated workflow engine for the analysis of LC-MS proteomics data using OpenMS. TOPPAS facilitates rapid construction of complex analysis workflows and offers parallel data processing on multi-core systems. The entire data processing workflow can be designed, tested, fine-tuned, executed, and documented in a single interface, thus providing researchers with a convenient way to organize and communicate their data analyses. The successor of TOPPAS, an OpenMS plugin for the popular workflow platform KNIME, takes this approach one step further: in addition to the data processing tools provided by OpenMS, KNIME offers a wealth of workflow nodes for downstream data manipulation, statistical analysis, and visualization. To enable analyses that require massive compute power, we provide the KNIME2gUSE extension for KNIME, which allows KNIME workflows to be exported to the Grid and Cloud User Support Environment (gUSE), where they are executed on powerful grid and cloud resources.
Finally, we present a free plugin for the popular commercial Proteome Discoverer platform (Thermo Scientific) that makes OpenMS algorithms available to an even larger group of non-bioinformatics experts: LFQProfiler for label-free quantification and RNPxl for protein-RNA cross-linking data analysis. Motivated by common issues of existing approaches for label-free quantification in the context of high sample complexity, we have developed OptiQuant, a novel method for label-free quantification that uses mixed integer programming for globally optimal feature detection in label-free proteomics experiments. The OptiQuant workflow includes FeatureLinkerUnlabeledKD, a novel algorithm for retention time alignment and linking of corresponding signals across label-free LC-MS maps, which has become the state-of-the-art feature linking tool in OpenMS. Last but not least, we demonstrate the successful application of TOPPAS workflows for label-free quantification of proteomics data, statistical data analysis, and machine learning to assist in the forensic reconstruction of shooting incidents. Our proof-of-principle study demonstrates that proteomics can be used to match bullets to perforated vital organs based on the protein expression profiles found in traces of organic material remaining on the bullets. In cases involving multiple shooters, this information can help answer the crucial question: who fired the lethal bullet?
