Robust and Efficient Deep Visual Learning



dc.contributor.advisor Gehler, Peter Vincent (Dr.)
dc.contributor.author Prokudin, Sergey
dc.date.accessioned 2020-12-16T11:26:29Z
dc.date.available 2020-12-16T11:26:29Z
dc.date.issued 2020-12-16
dc.identifier.other 1743042825 de_DE
dc.identifier.uri http://hdl.handle.net/10900/110708
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1107086 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-52084
dc.description.abstract The past decade was marked by significant progress in artificial intelligence and statistical learning. However, the most impressive modern models come in the form of computationally expensive black boxes, and the majority of them lack the ability to reason robustly about the confidence of their predictions. The ability to quantify model uncertainty and recognize failure scenarios is crucial when incorporating such models into complex decision-making pipelines, e.g. autonomous driving or medical image analysis systems. It is also important to keep the computational cost of these models low. In the present thesis, these desired properties of robustness and efficiency of deep learning models are studied and developed in three specific areas of computer vision. First, we investigate deep probabilistic models that allow uncertainty quantification, i.e. models that "know what they do not know". Here, we propose a novel model for the task of angular regression that allows probabilistic object pose estimation from 2D images. We also show how the general deep density estimation paradigm can be adapted and used in two other real-world applications, ball trajectory prediction and brain imaging. Next, we turn to the field of 3D shape analysis and rendering. We propose a method for efficient encoding of 3D point clouds, a type of data that is hard to handle with conventional learning algorithms due to its unordered nature. We show that simple neural networks that use the developed encoding as input can match the performance of state-of-the-art methods on various point cloud processing tasks while using orders of magnitude fewer floating-point operations. Finally, we explore the emerging field of neural rendering and develop a framework that connects classic deformable 3D body models with modern image-to-image translation neural networks.
This combination allows efficient, controllable, photorealistic rendering of human avatars, with flexible control of the camera and the ability to change the body pose and shape appearance. The thesis concludes with a discussion of the presented methods, including current limitations and future research directions. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Deep learning, Maschinelles Lernen, Dimension 3, Avatar <Informatik> de_DE
dc.subject.ddc 004 de_DE
dc.subject.other machine learning en
dc.subject.other computer vision en
dc.subject.other computer graphics en
dc.title Robust and Efficient Deep Visual Learning en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2020-12-02
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
