Computational methods for ancient genome reconstruction



Document type: Dissertation
Date: 2018-06-05
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Nieselt, Kay (Apl. Prof. Dr.)
Day of Oral Examination: 2018-05-18
DDC Classification: 004 - Data processing and computer science
500 - Natural sciences and mathematics
570 - Life sciences; biology
930 - History of ancient world to ca. 499
Keywords: Genomics, ancient DNA, bioinformatics
Other Keywords: ancient DNA
License: Publishing license including print on demand


Next-generation sequencing (NGS) technologies have become the de facto standard for the systematic analysis of the genetic composition of organisms. This holds not only for modern DNA analysis but also for ancient DNA (aDNA), where NGS methods have almost entirely replaced PCR-based approaches. Studying DNA variation in ancient humans provides promising opportunities to uncover missing links in human history that are otherwise hard to detect. The methods that paleogenetics now provides for studying ancient populations can substantially change our understanding of human history. While this is encouraging, analyzing NGS data from ancient specimens still raises several issues that pose challenges to bioinformatics. As aDNA research projects typically produce low levels of DNA extract, bioinformatic methods have to cope with low DNA content. Furthermore, characteristics inherent to aDNA, such as DNA misincorporation patterns and human DNA contamination, not only from modern sources, further complicate successful analysis. Therefore, the primary research focus of bioinformatics in aDNA analysis lies on the reconstruction of ancient genomes and the subsequent analysis of the reconstructed genomes. Previously used pipeline scripts were applicable to only a few research questions and therefore required the respective tools to be adapted even for slightly different research scopes. The main part of this dissertation concentrates on the development of EAGER, a framework for the analysis of aDNA data that covers a variety of use cases and offers improvements over previously published methods. EAGER features several newly contributed analysis methods that aim to recover as much aDNA as possible from sequencing experiments.
Additionally, the pipeline provides an integrated solution for analyzing aDNA data, running several state-of-the-art analysis methods to reconstruct ancient genomes. The applicability of EAGER has been demonstrated in various aDNA analysis projects and is further illustrated within this thesis by its application to the reconstruction of the genome of George Bähr, the architect of the Dresden Frauenkirche, and to a total of 90 ancient Egyptian individuals from Abusir El-Meleq in Northern Egypt. Apart from the general processing handled by EAGER, the management of genomic data and metadata is of crucial importance in today's research. While sequencing projects have prospered in recent years, producing ever more data, efficient bioinformatic tools that help users organize, store and analyze their data have had difficulty keeping up with the growing volume. Consequently, population genetics currently lacks bioinformatics methods that let researchers organize and analyze their data in larger sample cohorts. The second part of this thesis therefore concentrates on the conceptual introduction of MitoBench and MitoDB. MitoBench is conceived as an advanced analysis application that researchers can use to integrate their mitochondrial population genetics data with metadata from a variety of resources. MitoDB, in turn, is conceived as a centrally accessible database of mitochondrial DNA with metadata, serving as a data resource for future analysis projects in a mitochondrial population genetics context. As sequencing costs are in steady decline and sample sizes for research projects are growing accordingly, novel methods and frameworks for the analysis of aDNA are required.
Overall, both the EAGER framework described in this dissertation and the conceptual ideas of MitoBench and MitoDB have contributed to a better understanding of human history. They will hopefully continue to aid researchers in elucidating the changes past populations have experienced.
