"Is There Choice in Non-Native Voice?" Linguistic Feature Engineering and a Variationist Perspective in Automatic Native Language Identification

URI: http://hdl.handle.net/10900/77443
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-774435
http://dx.doi.org/10.15496/publikation-18844
Document type: Dissertation
Date: 2017
Language: English
Faculty: 5 Philosophische Fakultät
Department: Allgemeine u. vergleichende Sprachwissenschaft
Advisor: Meurers, Detmar (Prof. Dr.)
Day of Oral Examination: 2017-05-12
DDC Classification: 400 - Language and Linguistics
Keywords: Computational linguistics, Automatic classification, Foreign language learning, Variationist linguistics, Identification
Other Keywords: Native Language Identification
Author Profiling
Text Classification
Second Language Acquisition
Variationist Sociolinguistics
NLI
License: Publishing license including print on demand

Abstract:

Is it possible to infer the native language of an author from a text written in a non-native language? Can we perform this task fully automatically? Interest in these questions led to the emergence of a research field called Native Language Identification (NLI) in the first decade of this century. The requirement to automatically identify a particular property from language data situates the task at the intersection of computer science and linguistics, that is, in computational linguistics, which combines both disciplines. This thesis targets several research questions in the context of NLI. In particular, what roles do surface features and more abstract linguistic cues play? How can different feature sets be combined, and how can the resulting large models be optimized? Do the findings generalize across data sets? Can we benefit from considering the task in the light of language variation theory? To approach these questions, we conduct a range of quantitative and qualitative explorations, employing different machine learning techniques. We show how linguistic insight can advance technology and how technology can advance linguistic insight, a fruitful and promising interplay.