Abstract:
This thesis focuses on computer-aided protein structure analysis and homology modeling. Proteins are produced in the cell according to their sequences, which are encoded in their genes, and the biological function of a protein depends on its structure. Computer-aided structure prediction is based on the detection of statistically significant homology by comparing the sequence of the protein to be modeled with the sequences of proteins of known structure. Because the direct study of proteins in vitro and in vivo requires laborious experiments, prediction methods relying on homology offer a practical alternative. In this context, bioinformatics methods for sequence analysis can identify homologies between related proteins that have evolved from a common ancestor. As more structures were solved experimentally, it became increasingly apparent that proteins with similar sequences predominantly share similar structural architectures (folds). An immediate application of this observation is the computer-aided modeling of protein structures by homology. A protein structure cannot be regarded as a rigid object; rather, it exists in defined conformational states that are related to biological function and can depend on external effects, e.g. the presence of a ligand. Because there is no signal for conformational changes at the level of sequence, sequence analyses fail to detect them. Consequently, today's structure prediction methods, which rely on sequence homology detection for modeling, may overlook alternative conformational states in a protein family. As shown in this work for the remodeling of aminotransferases, information about protein conformation can lead to better homology models. In the future, even more protein structures will be solved experimentally, yet only a few will show new and unrelated folds. Therefore, the majority of new structural data will be redundant with respect to sequence. As a result, comparative structural analyses in homology modeling, as introduced in this work, will gain in importance.
This thesis consists of three parts. In the first part, the computational environment MolTalk is introduced. Compared with other programming libraries that serve similar tasks, MolTalk was shown to be very fast in loading and interpreting PDB-formatted files, with moderate memory requirements. These properties are key for the structural database system MTDB and the relational sequence-to-structure system MBSIS. At the end of this first part, our structure analysis web server iMolTalk is presented. The novel integration of structural analyses, homology modeling and database access makes iMolTalk a unique and valuable service to scientists in molecular biology who work on macromolecules and their structures.
In the second part, template selection in homology modeling is addressed in more detail. First, I describe PDBalert, a software agent that periodically compares the sequences and models of iMolTalk users against the structures released by the PDB and reports new homologies by e-mail. Second, the problem of evaluating putative templates is addressed. Although these templates are homologous, they may represent different conformations, which cannot be detected by sequence comparison. As an application of comparative structural analysis, I developed Protopolis, which exhaustively compares homologous structural chains and clusters them according to structural similarity. The resulting groups can then be superimposed and visualized to identify possible conformational states.
The third part extends homology modeling to multimer modeling using a combination of derived structural restraints and protein-protein docking. As a biologically important application, the modeling of ring assemblies of AAA+ proteins is presented. Structural restraints derived from known ring structures form the basis of the multimer modeling and are subsequently applied in the docking of the monomers. This leads to oligomer models that can better explain the biological function of AAA+ proteins. For the apoptosome, we propose a reorientation of domains in the ring that is consistent with the derived structural restraints.