Learning Identifiable Representations: Independent Influences and Multiple Views

DSpace Repository


URI: http://hdl.handle.net/10900/148184
Dokumentart: PhDThesis
Date: 2023-11-28
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Schölkopf, Bernhard (Prof. Dr.)
Day of Oral Examination: 2023-06-19
DDC Classifikation: 004 - Data processing and computer science
Keywords: Maschinelles Lernen
Other Keywords:
representation learning
causal inference
machine learning
artificial intelligence
probabilistic modelling
License: http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=en
Show full item record


Intelligent systems, whether biological or artificial, perceive unstructured information from the world around them: deep neural networks designed for object recognition receive collections of pixels as inputs; living beings capture visual stimuli through photoreceptors that convert incoming light into electrical signals. Sophisticated signal processing is required to extract meaningful features (e.g., the position, dimension, and colour of objects in an image) from these inputs: this motivates the field of representation learning. But what features should be deemed meaningful, and how to learn them? We will approach these questions based on two metaphors. The first one is the cocktail-party problem, where a number of conversations happen in parallel in a room, and the task is to recover (or separate) the voices of the individual speakers from recorded mixtures—also termed blind source separation. The second one is what we call the independent-listeners problem: given two listeners in front of some loudspeakers, the question is whether, when processing what they hear, they will make the same information explicit, identifying similar constitutive elements. The notion of identifiability is crucial when studying these problems, as it specifies suitable technical assumptions under which representations are uniquely determined, up to tolerable ambiguities like latent source reordering. A key result of this theory is that, when the mixing is nonlinear, the model is provably non-identifiable. A first question is, therefore, under what additional assumptions (ideally as mild as possible) the problem becomes identifiable; a second one is, what algorithms can be used to estimate the model. The contributions presented in this thesis address these questions and revolve around two main principles. The first principle is to learn representation where the latent components influence the observations independently. Here the term “independently” is used in a non-statistical sense—which can be loosely thought of as absence of fine-tuning between distinct elements of a generative process. The second principle is that representations can be learned from paired observations or views, where mixtures of the same latent variables are observed, and they (or a subset thereof) are perturbed in one of the views—also termed multi-view setting. I will present work characterizing these two problem settings, studying their identifiability and proposing suitable estimation algorithms. Moreover, I will discuss how the success of popular representation learning methods may be explained in terms of the principles above and describe an application of the second principle to the statistical analysis of group studies in neuroimaging.

This item appears in the following Collection(s)