The pan-NLR'ome of Arabidopsis thaliana



Document type: Dissertation
Date: 2019-04-18
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Biochemie
Advisor: Weigel, Detlef (Prof. Dr.)
Day of Oral Examination: 2019-03-25
DDC Classification: 004 - Data processing and computer science
500 - Natural sciences and mathematics
570 - Life sciences; biology
580 - Plants (Botany)
Keywords: Plants, Immune system
Other Keywords:
plant immunity
License: Publishing license including print on demand


Plants are the major nutritional component of the human diet and provide us with shelter, fuel, and enjoyment. Substantial yield loss is caused by plant diseases transmitted by bacterial, fungal, and oomycete pathogens. Plants have an elaborate innate immune system to fight threatening pathogens, relying to a great extent on highly variable resistance (R) genes. R genes often encode intracellular nucleotide-binding leucine-rich repeat receptors (NLRs) that directly or indirectly recognize pathogens by the presence or the activity of effector proteins in the plant’s cells. NLRs contain variable N-terminal domains, a central nucleotide-binding (NB) domain, and C-terminal leucine-rich repeats (LRRs). The N-terminal domains can be used to distinguish between the evolutionarily conserved NLR classes TNL (with a toll/interleukin-1 receptor homology (TIR) domain), CNL (with a coiled-coil (CC) domain), and RNL (with an RPW8 domain). The architectural diversity is increased by additional integrated domains (IDs) found in different positions. Plant species have between a few dozen and several hundred NLRs. Intraspecific R gene diversity is also high, and the few NLRs known so far to be responsible for long-term resistance are often accession-specific. Intraspecific NLR studies to date suffer from several shortcomings: The pan-NLR’omes (the collection of all NLR genes and alleles occurring in a species) often cannot be comprehensively described because too few accessions are analyzed, and NLR detection is essentially always guided by reference genomes, which biases the detection of novel genes and alleles. In addition, inappropriate or immature bioinformatics analysis pipelines may miss NLRs during the assembly or annotation phase, or result in erroneous NLR annotations. Knowing the pan-NLR’ome of a plant species is key to obtaining novel resistant plants in the future.
I created an extensive and reliable database that defines the near-complete pan-NLR’ome of the model plant Arabidopsis thaliana. I focused on a panel of 65 diverse accessions and applied state-of-the-art targeted long-read sequencing (SMRT RenSeq). My analysis pipeline was designed to include optimized methods that could be applied to any SMRT RenSeq project. In the first part of my thesis, I set quality control standards for the assembly of NLR-coding genomic fragments. I further introduce a novel and thorough gene annotation pipeline, supported by careful manual curation. In the second part, I present the manuscript reporting the saturated, near-complete A. thaliana pan-NLR’ome. High species-wide NLR diversity is revealed at the domain architecture level, and the usage of novel IDs is highlighted. The core NLR complement is defined, and presence-absence polymorphisms in non-core NLRs are described. Furthermore, haplotype saturation is shown, selective forces are quantified, and evolutionarily coupled, co-evolving NLRs are detected. The method optimization results show that final NLR assembly quality is mainly influenced by the amount and the quality of input sequencing data. The results further show that manual curation of automated NLR predictions is crucial to prevent frequently occurring misannotations. The saturation of an NLR’ome has not been shown in any plant species so far, so this study provides an unprecedented view of intraspecific NLR variation, the core NLR complement, and the evolutionary trajectories of NLRs. IDs are used more frequently than previously known, suggesting a pivotal role of noncanonical NLRs in plant-pathogen interactions. This work sets new standards for the analysis of gene families at the species level. Future NLR’ome projects applied to important crop species will profit from my results and the easy-to-adopt analysis pipeline.
Ultimately, this will extend our knowledge of intraspecific NLR diversity beyond a few reference species or genomes and will facilitate the detection of functional NLRs for use in disease resistance breeding programs.
