Causal Feature Selection in Neuroscience



Document type: PhD thesis
Date: 2020-12-18
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Janzing, Dominik (PD Dr.)
Day of Oral Examination: 2020-10-16
DDC Classification: 004 - Data processing and computer science
500 - Natural sciences and mathematics
Keywords: Causality, Causal reasoning, Machine learning, Neuroscience, Motor cortex, Brain stimulation
Other Keywords:
causal feature selection
causal inference
causal discovery
motor cortex
non invasive brain stimulation
machine learning
time series


Causal inference, at times correct and at times false, is fundamentally intertwined with human nature. Humans tend to approach and explain the systems of the world and of everyday life through causal reasoning and causal statements, unconsciously trying to recover the causal graph that underlies their observations. Nevertheless, causal reasoning based on observations of the real world is seldom sound and precise. Particularly when the method used is based on plain correlations, causal statements can be far from causal: first, because of the implicit assumption of linear relationships, and second, because of the major problem of hidden confounding. One of the most complex and difficult systems for an applied scientist to explain is the human brain. The reason is threefold. First and foremost, the human brain is constructed in an intricate and sophisticated manner. Second, our means of observing its global functionality are limited, which ultimately means that causal sufficiency cannot be assumed in such a system. In other words, hidden common causes (also termed hidden confounders) will be omnipresent in our limited observations. Finally, the significant heterogeneity that the human brain exhibits across subjects in some of its physiological functions complicates the problem even further. This, in turn, explains why machine learning methods that try to predict biomarkers with the traditional approach of a non-causal model fail to generalize across different brains. Hence, one should be particularly careful with the methods one selects and the causal statements one makes when trying to understand and interpret brain function. In this thesis, we focus on constructing theorems and algorithms for causal inference on real data, trying to understand the relationship between the human brain and motor function.
More specifically, we target the problem of identifying the causes of a target variable without assuming causal sufficiency. We tackle both non-sequential and time series data, proving theorems for each case. The applications of our methods focus on the activity of the human motor cortex as it arises, first, naturally and, second, under non-invasive brain stimulation. We build experimental set-ups and conduct electroencephalographic (EEG) and stimulation experiments to study the functionality of the motor cortex across different subjects in these two cases, with the ultimate goal of explaining the observed heterogeneity in the recorded activity. The work presented in this thesis is both experimental (in its first part), with non-invasive experiments on the human brain that contribute to a better understanding of the motor cortex, and theoretical, contributing four theorems in the field of causal inference and two causal feature selection methods. We first approach brain activity from a purely machine learning perspective, analysing the brain activity of 27 healthy subjects during an upper-limb reaching task. We introduce a multi-task regression method to build personalised models that predict movement stability from limited trials. We do so by using information from other subjects as a prior and updating the weights of the model, when necessary, with trials from the current subject. Although the original goal of this work was to show the superiority of this prediction method, a side observation turned out to be the key to defining the next steps of the research presented here.
The features learnt by the individual prediction models differed significantly across subjects. Although no causal claim can be made yet, since this is a correlation-based observation, it is the first hint of heterogeneity in the activity of the human motor cortex. Such a discrepancy in the frequency and location of the learnt features could also imply a discrepancy in the response to non-invasive brain stimulation over the motor cortex. To examine this possibility, we conducted a new series of electrophysiological experiments on twenty healthy participants, applying transcranial alternating current stimulation at 70 Hz over the motor cortex, as this has been considered to facilitate movement. Having observed significant variability in the behavioural response, ranging from negative to positive responders, we decided to investigate further the reasons that could explain it. We introduce an incremental three-step method to narrow down the causal model that can explain this discrepancy in responses. With our method, we conclude that beta oscillatory activity over the motor cortex could play a mediating role between the gamma stimulation and motor performance, although we cannot exclude the possibility that GABA activity could be a hidden common cause. Having witnessed such heterogeneity, both during natural movements and under brain stimulation, we stress the importance of taking steps towards the personalisation of brain stimulation parameters. We conclude the experimental part of this work by constructing a pipeline that predicts, from resting-state EEG data, the behavioural response of each subject to the stimulation treatment. Such a screening could avoid redundant or even harmful stimulation sessions.
With two different stimulation studies, recruiting 42 healthy participants in total, we identify a biomarker that could be informative about the response of an individual to the aforementioned motor stimulation. In the theoretical part of this thesis, we focus on the problem of identifying direct and indirect causes of a target (e.g. motor performance) given a collection of possible candidates (e.g. brain activity in different locations and at different frequencies), while allowing for latent common causes. First, we propose and prove a theorem that introduces sufficient conditions, under assumptions that can naturally be met, for deciding the causal role of a feature with a single conditional independence test and a single conditioning variable. Given the hardness of statistically testing conditional independences in large and dense graphs (such as the brain), limiting the necessary tests to one significantly boosts the statistical strength of the results. Applying our conditions to the aforementioned neurophysiological data further supports the validity of the method. Applying the proposed conditions independently to each individual, without prior knowledge, led to three groups of identified causal features, each consistently related to a different quality of movement across subjects. We discuss how such a method could contribute to the selection of personalised brain stimulation parameters. As a final step, we approach the brain signal as continuous time series data. Although time series are observed almost everywhere in nature, causal inference on such data in the presence of hidden confounders has been an unsolved problem, with the widely known Granger causality being the only approach for almost half a century.
The final contribution of this thesis is a set of three theorems: two that introduce both necessary and sufficient conditions for causal feature selection on time series, under some graph constraints, and a third that relaxes one of the stricter assumptions of the first two. We demonstrate the validity of our method on both simulated and real data.
