Abstract:
Whilst the structural diversity of proteins may appear endless, even large protein complexes can be decomposed into their independently folding units, the domains. Little is known about how domains emerged. Structural and sequence evidence suggests that they evolved through the combination of subdomain-sized fragments. To investigate this hypothesis, we searched for homologous regions among domains of broadly different topology (fold), focusing on the α/β class as defined in the Structural Classification of Proteins (SCOP), which is believed to contain the oldest domains. We compared their sequence profiles all-against-all and found that, in spite of their globally different architectures, a large number of them share local homologous regions ranging from a dozen to >200 amino acids. One interesting hit is the hemD-like fold, whose profile alignments provide strong evidence for its emergence via flavodoxin-like gene duplication, insertion and segment-swapping. To test this hypothesis experimentally, we reversed these evolutionary events and found that the resulting protein indeed adopts the canonical flavodoxin-like fold. These results illustrate one way in which Nature recycles a limited repertoire of building blocks, a successful strategy for reaching diversity at a lower molecular cost than creating every unit de novo. Such building units may have withstood selective pressure over the course of evolution owing to their function and/or intrinsic stability, which allowed them to be modified and extended. Inspired by this naturally occurring strategy, I designed a cobalamin-binding chimera by extracting a portion of the binding pocket of a cobalamin-binding domain and using it to replace the homologous region in the hemD-like fold.
The resulting chimera expresses solubly, is well folded and binds cobalamin, illustrating that mimicking Nature’s combinatorial approach can yield soluble, well-folded proteins and may be employed as an alternative strategy for designing novel functionalities.