"Is There Choice in Non-Native Voice?" Linguistic Feature Engineering and a Variationist Perspective in Automatic Native Language Identification


Citable link (URI): http://hdl.handle.net/10900/77443
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-774435
http://dx.doi.org/10.15496/publikation-18844
Document type: Dissertation
Date of publication: 2017
Language: English
Faculty: 5 Faculty of Humanities (Philosophische Fakultät)
Department: General and Comparative Linguistics
Referee: Meurers, Detmar (Prof. Dr.)
Date of oral examination: 2017-05-12
DDC classification: 400 - Language, Linguistics
Keywords: Computational linguistics, Automatic classification, Foreign language learning, Variationist linguistics, Identification
Other keywords: Native Language Identification
Author Profiling
Text Classification
Second Language Acquisition
Variationist Sociolinguistics
NLI
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

Is it possible to infer the native language of an author from a non-native text? Can we perform this task fully automatically? Interest in these questions led to the emergence of a research field called Native Language Identification (NLI) in the first decade of this century. The requirement to automatically identify a particular property based on some language data situates the task at the intersection of computer science and linguistics, that is, in computational linguistics, which combines both disciplines. This thesis targets several relevant research questions in the context of NLI. In particular, what is the role of surface features and of more abstract linguistic cues? How can different sets of features be combined, and how can the resulting large models be optimized? Do the findings generalize across different data sets? Can we benefit from considering the task in the light of language variation theory? To approach these questions, we conduct a range of quantitative and qualitative explorations, employing different machine learning techniques. We show how linguistic insight can advance technology, and how technology can advance linguistic insight, constituting a fruitful and promising interplay.
