dc.contributor.advisor | Black, Michael J. (Prof. Dr.) |
dc.contributor.author | Ranjan, Anurag |
dc.date.accessioned | 2020-07-27T07:21:53Z |
dc.date.available | 2020-07-27T07:21:53Z |
dc.date.issued | 2020-07-27 |
dc.identifier.other | 1725608847 | de_DE
dc.identifier.uri | http://hdl.handle.net/10900/103870 |
dc.identifier.uri | http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1038701 | de_DE
dc.identifier.uri | http://dx.doi.org/10.15496/publikation-45248 |
dc.description.abstract | The motion of the world inherently depends on the spatial structure and geometry of the world. Therefore, classical optical flow methods try to model this geometry to solve for the motion. However, recent deep learning methods take a completely different approach: they try to predict optical flow by learning from labelled data. Although deep networks have shown state-of-the-art performance on classification problems in computer vision, they have not been as effective at solving optical flow. The key reason is that deep learning methods do not explicitly model the structure of the world in the neural network; instead, they expect the network to learn about the structure from data. We hypothesize that it is difficult for a network to learn about motion without any constraint on the structure of the world. Therefore, we explore several approaches to explicitly model the geometry of the world and its spatial structure in deep neural networks.
The spatial structure in images can be captured by representing it at multiple scales. To represent multiple scales of images in deep neural networks, we introduce the Spatial Pyramid Network (SPyNet). Such a network can leverage global information for estimating large motions and local information for estimating small motions. We show that SPyNet significantly improves over previous optical flow networks while also being the smallest and fastest neural network for motion estimation: it reduces model parameters by 97% compared to previous methods and is more accurate.
The spatial structure of the world extends to people and their motion. Humans have a very well-defined structure, and this information is useful for estimating human optical flow. To leverage it, we create a synthetic dataset for human optical flow using a statistical human body model and motion capture sequences. We use this dataset to train deep networks and observe a significant improvement in their ability to estimate human optical flow.
The structure and geometry of the world affect the motion. Therefore, learning about the structure of the scene together with the motion can benefit both problems. To facilitate this, we introduce Competitive Collaboration, in which several neural networks are constrained by geometry and can jointly learn about structure and motion in the scene without any labels. We show that jointly learning single-view depth prediction, camera motion, optical flow and motion segmentation using Competitive Collaboration achieves state-of-the-art results among unsupervised approaches.
Our findings support our hypothesis that explicit constraints on the structure and geometry of the world lead to better methods for motion estimation. | en
dc.language.iso | en | de_DE
dc.publisher | Universität Tübingen | de_DE
dc.rights | ubt-podok | de_DE
dc.rights.uri | http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de | de_DE
dc.rights.uri | http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en | en
dc.subject.classification | Deep learning |
dc.subject.ddc | 004 | de_DE
dc.subject.other | Optical flow | en
dc.subject.other | machine vision | en
dc.subject.other | depth | en
dc.title | Towards Geometric Understanding of Motion | en
dc.type | PhDThesis | de_DE
dcterms.dateAccepted | 2019-12-20 |
utue.publikation.fachbereich | Informatik | de_DE
utue.publikation.fakultaet | 7 Mathematisch-Naturwissenschaftliche Fakultät | de_DE
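
The abstract describes SPyNet's coarse-to-fine spatial pyramid, in which flow estimated at a coarse scale captures large motions and is refined with local information at finer scales. The following is a minimal illustrative sketch of that idea in PyTorch, not the thesis implementation; the module and function names (FlowUpdate, warp, spynet_forward), the layer sizes, and the number of pyramid levels are assumptions chosen for brevity.

```python
# A minimal sketch (not the thesis code) of coarse-to-fine flow estimation in
# the spirit of SPyNet as described in the abstract. All names and layer sizes
# here are illustrative assumptions, not the author's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlowUpdate(nn.Module):
    """Small CNN predicting a residual flow at one pyramid level.
    Input: first image, warped second image, upsampled coarse flow (8 channels)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(8, 32, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 7, padding=3),  # 2-channel residual flow
        )

    def forward(self, img1, img2_warped, flow_up):
        return self.net(torch.cat([img1, img2_warped, flow_up], dim=1))

def warp(img, flow):
    """Backward-warp img with a dense pixel-unit flow field via grid_sample."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack([xs, ys], dim=0).float().to(img)       # (2, H, W)
    coords = grid.unsqueeze(0) + flow                          # target coordinates
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0        # normalise to [-1, 1]
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid_norm = torch.stack([coords_x, coords_y], dim=-1)      # (B, H, W, 2)
    return F.grid_sample(img, grid_norm, align_corners=True)

def spynet_forward(levels, img1, img2):
    """Coarse-to-fine estimation: large motions at coarse scales, local refinement at fine scales."""
    # build image pyramids, coarsest first
    pyr1, pyr2 = [img1], [img2]
    for _ in range(len(levels) - 1):
        pyr1.insert(0, F.avg_pool2d(pyr1[0], 2))
        pyr2.insert(0, F.avg_pool2d(pyr2[0], 2))
    b, _, h0, w0 = pyr1[0].shape
    flow = torch.zeros(b, 2, h0, w0, device=img1.device)
    for level, i1, i2 in zip(levels, pyr1, pyr2):
        # upsample the coarser flow (scaling the pixel displacements) and add a residual update
        flow = 2.0 * F.interpolate(flow, size=i1.shape[-2:],
                                   mode="bilinear", align_corners=True)
        flow = flow + level(i1, warp(i2, flow), flow)
    return flow

if __name__ == "__main__":
    levels = [FlowUpdate() for _ in range(3)]
    a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
    print(spynet_forward(levels, a, b).shape)  # torch.Size([1, 2, 64, 64])
```

In the published SPyNet the per-level networks are trained with supervision at their respective scales; the sketch above (with random weights) only shows the coarse-to-fine inference structure that lets a very small network handle large motions.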
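
The abstract also describes Competitive Collaboration, in which depth, camera motion, optical flow and motion segmentation networks are coupled through geometric constraints and learned without labels. The sketch below is a rough, assumed illustration of one ingredient of such a scheme: a motion-segmentation mask decides per pixel whether the static-scene players (depth and camera motion, via the rigid flow they induce) or the flow network must explain the photometric error. Function names such as rigid_flow and cc_photometric_loss are hypothetical, and the details differ from the thesis.

```python
# A minimal sketch (an assumption, not the thesis implementation) of how a
# Competitive Collaboration style objective can combine the players' outputs.
import torch

def rigid_flow(depth, K, K_inv, R, t):
    """Flow induced by camera motion (R, t) on a static scene with given depth.
    depth: (B,1,H,W), K/K_inv: (B,3,3), R: (B,3,3), t: (B,3,1)."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    ones = torch.ones_like(xs)
    pix = torch.stack([xs, ys, ones], dim=0).float().reshape(1, 3, -1).to(depth)
    rays = K_inv @ pix                     # back-project pixels to viewing rays
    pts = depth.reshape(b, 1, -1) * rays   # 3D points in the first camera
    proj = K @ (R @ pts + t)               # project into the second camera
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)
    return (uv - pix[:, :2]).reshape(b, 2, h, w)

def cc_photometric_loss(img1_warped_static, img1_warped_flow, img1, mask):
    """mask in [0,1]: 1 = static pixel (explained by depth + camera motion),
    0 = independently moving pixel (explained by the flow network)."""
    err_static = (img1_warped_static - img1).abs().mean(dim=1, keepdim=True)
    err_flow = (img1_warped_flow - img1).abs().mean(dim=1, keepdim=True)
    return (mask * err_static + (1.0 - mask) * err_flow).mean()

if __name__ == "__main__":
    B, H, W = 1, 8, 8
    depth = torch.ones(B, 1, H, W)
    K = torch.eye(3).repeat(B, 1, 1)
    R = torch.eye(3).repeat(B, 1, 1)
    t = torch.tensor([[0.1], [0.0], [0.0]]).repeat(B, 1, 1)
    print(rigid_flow(depth, K, torch.inverse(K), R, t).shape)  # torch.Size([1, 2, 8, 8])
```

In the thesis's Competitive Collaboration the mask itself comes from a motion segmentation network that moderates between the two players, and training alternates between competition and collaboration phases; the code above only illustrates how geometry constrains the static-scene players.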