From Points to Probability Measures: Statistical Learning on Distributions with Kernel Mean Embedding

dc.contributor.advisor Schölkopf, Bernhard (Prof. Dr.)
dc.contributor.author Muandet, Krikamol
dc.date.accessioned 2015-12-21T12:41:01Z
dc.date.available 2015-12-21T12:41:01Z
dc.date.issued 2015
dc.identifier.other 453571611 de_DE
dc.identifier.uri de_DE
dc.description.abstract The dissertation presents a novel framework for learning on probability measures, which has abundant real-world applications. In the classical setup, the data are assumed to be independent and identically distributed (i.i.d.) points drawn from some unknown distribution. In many scenarios, however, representing data as distributions may be preferable. For instance, when measurements are noisy, we may tackle the uncertainty by treating the data themselves as distributions, as is often the case for microarray data and astronomical data, where the measurement process is imprecise and replication is often required. Distributions not only embody individual data points but also carry information about their interactions, which can be beneficial for structural learning in high-energy physics, cosmology, causality, and so on. Moreover, classical problems in statistics, such as statistical estimation, hypothesis testing, and causal inference, may be interpreted in a decision-theoretic sense as machine learning problems on empirical distributions. Rephrasing these problems as such leads to novel approaches to statistical inference and estimation. Hence, allowing learning algorithms to operate directly on distributions opens up a wide range of future applications. To work with distributions, the key methodology adopted in this thesis is the kernel mean embedding of distributions, which represents each distribution as a mean function in a reproducing kernel Hilbert space (RKHS). The kernel mean embedding has been applied successfully in two-sample testing, graphical models, and probabilistic inference. This thesis, by contrast, focuses mainly on predictive learning on distributions, i.e., settings in which the observations are distributions and the goal is to make predictions about previously unseen distributions.
More importantly, the thesis investigates kernel mean estimation, which is one of the most fundamental problems of kernel methods. Probability distributions, as opposed to data points, carry information at a higher level, such as the aggregate behavior of data points, how the underlying process evolves over time and across domains, and complex concepts that cannot be described by individual points alone. Intelligent organisms have the ability to recognize and exploit such information naturally. Thus, this work may shed light on the future development of intelligent machines and, most importantly, may provide clues to the true meaning of intelligence. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri de_DE
dc.rights.uri en
dc.subject.classification Maschinelles Lernen de_DE
dc.subject.ddc 004 de_DE
dc.subject.ddc 500 de_DE
dc.subject.ddc 510 de_DE
dc.subject.other kernel methods en
dc.subject.other kernel mean embedding en
dc.subject.other statistical learning theory en
dc.subject.other reproducing kernel Hilbert space en
dc.subject.other support vector machine en
dc.subject.other support measure machine en
dc.subject.other kernel mean shrinkage estimators en
dc.subject.other distributional risk minimization en
dc.subject.other empirical risk minimization en
dc.title From Points to Probability Measures: Statistical Learning on Distributions with Kernel Mean Embedding en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2015-09-30
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE