Probabilistic Linear Algebra for Stochastic Optimization

dc.contributor.advisor Hennig, Philipp (Prof. Dr.)
dc.contributor.author de Roos, Filip
dc.date.accessioned 2022-09-12T08:59:25Z
dc.date.available 2022-09-12T08:59:25Z
dc.date.issued 2022-09-12
dc.identifier.uri http://hdl.handle.net/10900/131697
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1316971 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-73055
dc.description.abstract The emerging field of machine learning has become the main driver of data-driven discovery. Yet with ever more data, it also faces new computational challenges. To make machines "learn", the desired task is typically phrased as an empirical risk minimization problem that must be solved by numerical optimization routines. Optimization in ML differs from traditional optimization in two regards. First, ML deals with large datasets that must be subsampled to reduce the computational burden, inadvertently introducing noise into the optimization procedure. Second, the sheer size of the parameter space severely limits the amount of information an optimization algorithm can store. Together, these aspects have made first-order optimization routines the prevalent choice for model training in ML. First-order algorithms use only gradient information to determine a step direction and step length for updating the parameters. Including second-order information about the local curvature has great potential to improve the optimizer's performance, provided it can be done efficiently. Probabilistic curvature estimation for use in optimization is a recurring theme of this thesis, and the problem is explored in three directions relevant to ML training. By iteratively adapting the scale of an arbitrary curvature estimate, it is possible to circumvent the tedious manual tuning of the optimizer's step length during model training; the general form of the curvature estimate naturally extends its applicability to various popular optimization algorithms. Curvature can also be inferred with matrix-variate distributions from projections of the curvature matrix; noise can then be captured by a likelihood of non-vanishing width, leading to a novel update strategy that uses the inherent uncertainty to estimate the curvature. Finally, a new form of curvature estimate is derived from gradient observations of a nonparametric model, expanding the family of viable curvature estimates used in optimization. An important outcome of this research is to highlight the benefit of utilizing curvature information in stochastic optimization. By considering multiple ways of efficiently leveraging second-order information, the thesis advances the frontier of stochastic optimization and opens new avenues for research on the training of large-scale ML models. en
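To make the abstract's contrast between first-order (gradient-only) updates and curvature-preconditioned updates concrete, here is a minimal illustrative sketch, not an algorithm from the thesis: it compares a plain noisy gradient step with a step preconditioned by the (here exactly known, diagonal) curvature on a toy ill-conditioned quadratic. All names and constants (A, alpha, the noise scale) are hypothetical choices for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([100.0, 1.0])              # ill-conditioned quadratic loss L(x) = 0.5 * x^T A x
x_sgd = np.array([1.0, 1.0])           # iterate for the plain first-order method
x_pre = np.array([1.0, 1.0])           # iterate for the curvature-preconditioned method

def grad(x, noise=0.1):
    # Noisy gradient of the quadratic, mimicking mini-batch subsampling noise.
    return A @ x + noise * rng.standard_normal(2)

alpha = 0.009                          # step length kept below 2 / lambda_max for stability
P = np.diag(1.0 / np.diag(A))          # inverse-curvature preconditioner (known here, estimated in practice)

for _ in range(50):
    x_sgd = x_sgd - alpha * grad(x_sgd)   # first-order: gradient only, fixed step length
    x_pre = x_pre - P @ grad(x_pre)       # second-order: step rescaled by local curvature

print("plain gradient step, distance to optimum:  ", np.linalg.norm(x_sgd))
print("preconditioned step, distance to optimum:  ", np.linalg.norm(x_pre))
```

On this toy problem the preconditioned iterate converges in essentially one step up to the gradient noise, while the plain gradient iterate is slowed by the ill-conditioning; the thesis is concerned with estimating such curvature information probabilistically when it is not available in closed form.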
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Maschinelles Lernen, Optimierung, Wahrscheinlichkeit de_DE
dc.subject.ddc 004 de_DE
dc.subject.other Machine Learning en
dc.subject.other Optimization en
dc.subject.other Probability Theory en
dc.title Probabilistic Linear Algebra for Stochastic Optimization en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2022-04-07
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE