From Points to Probability Measures: Statistical Learning on Distributions with Kernel Mean Embedding

Muandet, Krikamol


Files:	thesis.pdf (5.72 MB, PDF). Description: PhD thesis

Citable link (URI):	http://hdl.handle.net/10900/67223 http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-672231 http://dx.doi.org/10.15496/publikation-8643 http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-672236 http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-672232
Document type:	Dissertation
Date of publication:	2015
Language:	English
Faculty:	7 Mathematisch-Naturwissenschaftliche Fakultät (Faculty of Science)
Department:	Computer Science
Reviewer:	Schölkopf, Bernhard (Prof. Dr.)
Date of oral examination:	2015-09-30
DDC classification:	004 - Computer science; 500 - Natural sciences; 510 - Mathematics
Keywords:	Machine learning
Free keywords:	kernel methods, kernel mean embedding, statistical learning theory, reproducing kernel Hilbert space, support vector machine, support measure machine, kernel mean shrinkage estimators, distributional risk minimization, empirical risk minimization
License:	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

This dissertation presents a novel framework for learning on probability measures, which has abundant real-world applications. In the classical setup, the data are assumed to be points drawn independently and identically distributed (i.i.d.) from some unknown distribution. In many scenarios, however, representing data as distributions may be preferable. For instance, when measurements are noisy, we may tackle the uncertainty by treating the data themselves as distributions, which is often the case for microarray data and astronomical data, where the measurement process is imprecise and replication is often required. Distributions not only embody individual data points but also carry information about their interactions, which can be beneficial for structural learning in high-energy physics, cosmology, causality, and so on. Moreover, classical problems in statistics such as statistical estimation, hypothesis testing, and causal inference may be interpreted in a decision-theoretic sense as machine learning problems on empirical distributions. Rephrasing these problems in this way leads to novel approaches to statistical inference and estimation. Hence, allowing learning algorithms to operate directly on distributions opens up a wide range of future applications. To work with distributions, the key methodology adopted in this thesis is the kernel mean embedding of distributions, which represents each distribution as a mean function in a reproducing kernel Hilbert space (RKHS). The kernel mean embedding has been applied successfully in two-sample testing, graphical models, and probabilistic inference. This thesis, in contrast, focuses mainly on predictive learning on distributions, i.e., the setting in which the observations are distributions and the goal is to make predictions about previously unseen distributions. More importantly, the thesis investigates kernel mean estimation, which is one of the most fundamental problems of kernel methods.
Probability distributions, as opposed to data points, carry information at a higher level, such as the aggregate behavior of data points, how the underlying process evolves over time and across domains, and complex concepts that cannot be described by individual points alone. Intelligent organisms have the ability to recognize and exploit such information naturally. Thus, this work may shed light on the future development of intelligent machines and, most importantly, may provide clues to the true meaning of intelligence.
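
As an illustration of the kernel mean embedding mentioned in the abstract, the following is a minimal sketch and not code from the thesis. It assumes a Gaussian kernel with a fixed bandwidth, and all function names (`gaussian_kernel`, `empirical_mean_embedding`, `embedding_inner_product`, `squared_mmd`, `draw_bag`) are hypothetical. Each sample ("bag") of points is represented by the empirical mean function (1/n) Σ_i k(x_i, ·), and two bags are compared through the squared RKHS distance between their embeddings.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).

    The kernel choice and bandwidth are assumptions made for this sketch.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def empirical_mean_embedding(sample, bandwidth=1.0):
    """Return the empirical kernel mean t -> (1/n) * sum_i k(x_i, t)."""
    def mean_function(t):
        return sum(gaussian_kernel(x, t, bandwidth) for x in sample) / len(sample)
    return mean_function


def embedding_inner_product(sample_p, sample_q, bandwidth=1.0):
    """RKHS inner product of two empirical embeddings: the mean of k(x_i, y_j)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in sample_p for y in sample_q)
    return total / (len(sample_p) * len(sample_q))


def squared_mmd(sample_p, sample_q, bandwidth=1.0):
    """Squared RKHS distance between two empirical embeddings (biased estimate)."""
    return (
        embedding_inner_product(sample_p, sample_p, bandwidth)
        + embedding_inner_product(sample_q, sample_q, bandwidth)
        - 2.0 * embedding_inner_product(sample_p, sample_q, bandwidth)
    )


def draw_bag(rng, mean, size=100, dim=2):
    """Draw a bag of points from an isotropic Gaussian with unit variance."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(size)]


rng = random.Random(0)
bag_a = draw_bag(rng, 0.0)  # sample from N(0, I)
bag_b = draw_bag(rng, 0.0)  # second sample from the same distribution
bag_c = draw_bag(rng, 3.0)  # sample from a shifted distribution

mmd_same = squared_mmd(bag_a, bag_b)
mmd_diff = squared_mmd(bag_a, bag_c)
print("same distribution:", mmd_same)
print("shifted distribution:", mmd_diff)
```

Under these assumptions, the distance between the two bags drawn from the same distribution should come out much smaller than the distance to the shifted bag. A learning algorithm operating on distributions could use `embedding_inner_product` as a kernel between bags in place of a kernel between points.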
