Abstract:
With increasing digitalization in all areas of society, including education, data and technology offer new and exciting potential. Systems can diagnose potential problems of learners; systems and humans can learn from data to continuously improve concrete teaching practices; and researchers can contribute to an understanding of learning, both in specific domains and in general. It is not entirely clear what underlies the remarkable human ability to learn. Learning Analytics can shed light on the factors that contribute to learning and can help improve education, thus having a positive impact on society. Learning Analytics allows us to learn from the past to improve the future.
While data points are already generated with every interaction in digital applications such as tutoring systems, principled ways of making sense of these data and realizing their potential are needed. Concretely, the interdisciplinary context needs to provide guidance regarding which features to consider when going from raw interaction logs to interpretable feature representations. Empirical Educational Science and pedagogy need to guide the design of systems, Learning Analytics tools, and data analyses so that they are relevant not only from a scientific but also from a pedagogical perspective. So far, there is a glaring lack of interdisciplinary collaboration between Empirical Educational Scientists, system designers, pedagogues, Computational Linguists, and Second Language Acquisition researchers. Furthermore, little research has been conducted in this area outside of strictly constrained lab studies or studies with small or homogeneous samples. To address this gap, in this thesis we explore Learning Analytics in the context of Intelligent Computer-Assisted Language Learning in large-scale, ecologically valid contexts.
We first lay the foundation by describing the three main research fields this dissertation builds upon: Second Language Acquisition, Tutoring Systems, and Learning Analytics and Educational Data Mining. In the second part of the thesis, we establish an empirical basis for subsequent analyses and applications. We describe the FeedBook system and its feedback mechanisms in detail before turning to the FeedBook study and the data collected in it. The third part combines the two previous parts by showcasing both a range of Learning Analytics applications targeting the needs of different user groups and data analyses that shed light on the relation between learning process features and learning outcomes.
With respect to the Learning Analytics functions, we show how an open learner model can fulfill the needs of individual students, how information can be aggregated at the school class level and presented to fulfill the requirements of teachers, and how data about the entire learner population can be made accessible in a useful way for material designers. With respect to the data analyses, the findings demonstrate that raw feedback counts are not useful in predicting learning gains. Rather, answers submitted correctly to the teacher are indicative. We investigate this learning product and split it up into learning process variables. Previous knowledge, manifested in answers submitted correctly on the first try, and uptake based on specific feedback both lead to correctly submitted answers and are significant predictors of learning gains. In contrast, uptake based on blocked (i.e., only binary) feedback is not indicative of learning gains. Furthermore, we show an effect of time-on-task for the control group. Taken together, the results indicate that not only the provision of specific feedback, but also attention to that feedback and efficient strategies for processing it, predict learning gains.
The main contributions of this thesis arise from its unique place at the interdisciplinary crossroads of Empirical Educational Science, Computational Linguistics, Second Language Acquisition, and Learning Analytics. Our primary contributions are the development and description of a tutoring system with interactive real-time feedback; the implementation of diverse Learning Analytics tools that use and visualize, for different educational stakeholders, the learning process data collected with this system; and the advancement of statistical models that empirically link learning process features postulated by learning theories with learning outcomes from authentic, large-scale contexts.