Abstract:
Complexity classes are defined by quantitative restrictions of resources available to a computational model, like for instance the Turing machine.
Contrarily, there is no obvious commonality in the definition of families of formal languages - instead they are described by example.
This thesis is about the characterization of what makes a set of languages a family of formal languages.
Families of formal languages, like for example the regular, context-free languages and their sub-families exhibit properties that are contrasted by the ones of complexity classes.
Two of the properties families of formal languages seem to have is closure of intersection with regular languages, another is the existence of pumping or iteration arguments which yield the decidability of the emptiness.
Complexity classes do not generally have a decidable emptiness, which lead us to a first candidate for the notion of formality - the decidability of the emptiness of regular intersection (intreg).
We refute the decidability of intreg as a criterion by hiding the difficulty of deciding the emptiness of regular intersection:
We show that for every decidable language L there is a language L' of essentially the same complexity such that intreg(L') is decidable.
This implies that every complexity class contains complete languages for which the emptiness of regular intersection is decidable.
An intermediate result we show is that the set of true quantified Boolean formulae has a decidable emptiness of regular intersection.
As the known families of formal languages are all contained in NP, this yields a language (probably) outside of NP for which intreg is decidable, which additionally is a natural language in contrast to the artificial ones obtained by the hiding process.
We introduce the notion of protocol languages which capture in some sense the behavior of a data-structure underlying the model of a formal language.
They are defined in a fragment of second order logic, where the second order variables are uniquely determined by each word in the language and each letter implies a determined sub-structure of a word.
Viewing the letters of a word as vertices and the successor as edges between them, each word can be seen as a path.
The binary second order variables can be viewed as additional edges between word positions.
Therefore, each word in a protocol language defines some unique graph.
These graphs can be recognized by covering them with a predefined set of tiles which are node and edge-labeld graphs.
Additional numerical constraints on the amount of each tile-type yields shrinking-arguments for protocol languages.
If a word w in a protocol language exceeds a certain length such that the numerical constraints are (over-)satisfied, one can constuctively generate a shorter word w' from w that is also contained in the protocol language.
We define logical extensions of protocol languages by allowing the conjunction of additional first order or monadic second order definable formulae and analyze the extensions in regard to trio operations.
Protocol languages for the regular, context-free and indexed languages are exhibited -- for the first two we give protocol languages which act as generators for the respective family of formal languages.
Finally, we show that the emptiness of protocol languages is decidable.