Abstract:
The spatial distribution of linguistic diversity and of structural elements of language, such as morphosyntactic and phonological features, is a rich source of knowledge for those interested in historical linguistics and the evolution of language.
Bayesian spatial models are a promising way to uncover these patterns.
However, modelling linguistic data in space comes with a unique set of complexities and considerations.
These include questions of how to represent the geographic locations of languages, how to measure the distance between them, and how to account for variations in topography.
In this thesis, I present a series of case studies on African languages that illustrate different Bayesian spatial models, all of which simultaneously incorporate information about language history in the form of phylogenies or language families. Because the application of spatial models in linguistic typology is a relatively recent development, I will first provide a general overview of some common models used in spatial statistics and describe the approaches used in this thesis.
Environmental and social factors have long been thought to influence the distribution of linguistic diversity in time and space. Using a novel combination of methods, I will show that the factors which affect recent diversification are distinct from those which affect the maintenance of diversity over time.
Following that, I will examine areal patterns of structural convergence between African languages and uncover systematic variation in the diffusibility of structural elements of language.
Lastly, I will examine geographic patterns of data sparsity and discuss their implications for future statistical studies in linguistics.