An Analysis of the Inner Workings of Variational Autoencoders


dc.contributor.advisor Martius, Georg (Dr.)
dc.contributor.author Zietlow, Urs Dominik
dc.date.accessioned 2023-01-18T11:56:33Z
dc.date.available 2023-01-18T11:56:33Z
dc.date.issued 2023-01-18
dc.identifier.uri http://hdl.handle.net/10900/135486
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1354867 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-76837
dc.description.abstract Representation learning, the task of extracting meaningful representations from high-dimensional data, lies at the very core of artificial intelligence research. It spans implicit feature learning in a variety of computer vision tasks, more traditional hand-crafted feature extraction for applications such as eye tracking, and the explicit learning of semantically meaningful data representations. Strictly speaking, the activations of any layer within a neural network can be considered a representation of the input data, which makes achieving explicit control over the properties of such representations a fundamentally attractive research goal. One frequently desired property of learned representations is disentanglement. The idea of a disentangled representation stems from the goal of separating the sources of variance in the data and is made concrete in the concept of recovering generative factors. Assuming that all data originate from a generative process that produces high-dimensional data from a low-dimensional representation (e.g., rendering images of people given visual attributes such as hairstyle, camera angle, age, ...), the goal of finding a disentangled representation is to recover those attributes. The Variational Autoencoder (VAE) is a well-known architecture commonly used for disentangled representation learning, and this work summarizes an analysis of its inner workings. VAEs attracted considerable attention due to their, at the time, unparalleled performance both as generative models and as inference models for learning disentangled representations. Note, however, that disentanglement is not invariant under rotations of the learned representation, i.e., rotating a learned representation can change and even destroy its disentanglement quality.
Given a rotationally symmetric prior over the representation space, the idealized objective function of VAEs is rotationally symmetric. Their success at producing disentangled representations is therefore particularly surprising. This thesis discusses why VAEs pursue a particular alignment for their representations and how the chosen alignment correlates with the generative factors of existing representation learning datasets. en
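The rotational-symmetry claim in the abstract can be sketched numerically: with a standard-normal prior, the KL term of the (idealized, full-covariance) VAE objective is unchanged when a Gaussian posterior N(mu, Sigma) is rotated to N(Q mu, Q Sigma Qᵀ) for any orthogonal Q. The code below is an illustrative sketch, not from the thesis itself; the function name and the choice of a random QR-based rotation are the author's own for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, Sigma):
    """Closed-form KL( N(mu, Sigma) || N(0, I) )."""
    k = mu.shape[0]
    return 0.5 * (np.trace(Sigma) + mu @ mu - k
                  - np.log(np.linalg.det(Sigma)))

rng = np.random.default_rng(0)
k = 4
mu = rng.normal(size=k)
A = rng.normal(size=(k, k))
Sigma = A @ A.T + np.eye(k)          # a valid (full) covariance matrix

# A random rotation: an orthonormal Q from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))

kl_original = kl_to_standard_normal(mu, Sigma)
kl_rotated = kl_to_standard_normal(Q @ mu, Q @ Sigma @ Q.T)

# trace, mu'mu, and det are all invariant under orthogonal Q,
# so the KL term of the objective cannot prefer any alignment.
print(np.isclose(kl_original, kl_rotated))
```

Note that this invariance holds for the idealized full-covariance posterior; the thesis's question of why trained VAEs nonetheless settle on a particular alignment is exactly what the symmetry above leaves unexplained.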
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.ddc 004 de_DE
dc.title An Analysis of the Inner Workings of Variational Autoencoders en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2022-11-16
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE