Probabilistic Numerical Linear Algebra for Machine Learning


dc.contributor.advisor Hennig, Philipp (Prof. Dr.)
dc.contributor.author Wenger, Jonathan
dc.date.accessioned 2023-08-02T08:53:57Z
dc.date.available 2023-08-02T08:53:57Z
dc.date.issued 2023-08-02
dc.identifier.uri http://hdl.handle.net/10900/143769
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1437693 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-85113
dc.description.abstract Machine learning models are becoming increasingly essential in domains where critical decisions must be made under uncertainty, such as in public policy, medicine or robotics. For a model to be useful for decision-making, it must convey a degree of certainty in its predictions. Bayesian models are well-suited to such settings due to their principled uncertainty quantification, given a set of assumptions about the problem and data-generating process. While, in theory, inference in a Bayesian model is fully specified, in practice numerical approximations have a significant impact on the resulting posterior. Therefore, model-based decisions are not just determined by the data but also by the numerical method. This raises the question of how we can account for the adverse impact of numerical approximations on inference. Arguably, the most common numerical task in scientific computing is the solution of linear systems, which arise in probabilistic inference, graph theory, differential equations and optimization. In machine learning, these systems are typically large-scale, subject to noise and arise from generative processes. These unique characteristics call for specialized solvers. In this thesis, we propose a class of probabilistic linear solvers, which infer the solution to a linear system and can be interpreted as learning algorithms themselves. Importantly, they can leverage problem structure and propagate their error to the prediction of the underlying probabilistic model. Next, we apply such solvers to accelerate Gaussian process inference. While Gaussian processes are a principled and flexible model class, inference on large datasets is computationally prohibitive in both time and memory due to the required computations with the kernel matrix. We show that by approximating the posterior with a probabilistic linear solver, we can invest an arbitrarily small amount of computation and still obtain a provably coherent prediction that quantifies uncertainty exactly. Finally, we demonstrate that Gaussian process hyperparameter optimization can similarly be accelerated by leveraging structural prior knowledge in the model via preconditioning of iterative methods. Combined with modern parallel hardware, this enables training Gaussian process models on datasets with hundreds of thousands of data points. In summary, we demonstrate that interpreting numerical methods in linear algebra as probabilistic learning algorithms unlocks significant performance improvements for Gaussian process models. Crucially, we show how to account for the impact of numerical approximations on model predictions via uncertainty quantification. This enables an explicit trade-off between computational resources and confidence in a prediction. The techniques developed in this thesis have advanced the understanding of probabilistic linear solvers, have shifted the goalposts of what can be expected from Gaussian process approximations, and have defined the way large-scale Gaussian process hyperparameter optimization is performed in GPyTorch, arguably the most popular library for Gaussian processes in Python. en
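
The preconditioned, iterative large-scale training mentioned at the end of the abstract is available in GPyTorch. The following is a minimal, purely illustrative sketch (not code from the thesis): it forces GPyTorch's conjugate-gradient solver path instead of a Cholesky decomposition and enlarges the pivoted-Cholesky preconditioner used for those solves; the synthetic dataset, RBF kernel and the specific setting values are assumptions made only for this example.

import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Synthetic 1D regression data, standing in for a large real-world dataset.
train_x = torch.linspace(0, 1, 2_000).unsqueeze(-1)
train_y = torch.sin(20 * train_x.squeeze()) + 0.1 * torch.randn(train_x.size(0))

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

model.train()
likelihood.train()

# max_cholesky_size(0) forces the iterative (conjugate-gradient) solver for all
# kernel-matrix solves; max_preconditioner_size sets the rank of the
# pivoted-Cholesky preconditioner. Both values here are illustrative only.
with gpytorch.settings.max_cholesky_size(0), gpytorch.settings.max_preconditioner_size(100):
    for _ in range(50):
        optimizer.zero_grad()
        loss = -mll(model(train_x), train_y)
        loss.backward()
        optimizer.step()

Combined with modern parallel hardware for the matrix-vector multiplies inside the conjugate-gradient iterations, this iterative, preconditioned path is what the abstract refers to as enabling hyperparameter optimization on datasets with hundreds of thousands of points.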
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Numerische Mathematik, Maschinelles Lernen de_DE
dc.subject.ddc 004 de_DE
dc.title Probabilistic Numerical Linear Algebra for Machine Learning en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2023-07-25
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE
