Toward Constrained Animal Pose Estimation



Document type: PhD thesis
Date: 2023-09-27
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Macke, Jakob H. (Prof. Dr.)
Day of Oral Examination: 2023-07-04
DDC Classification: 004 - Data processing and computer science
500 - Natural sciences and mathematics
570 - Life sciences; biology
Keywords: Machine learning, Pose, Neuroscience, Artificial intelligence
Other Keywords:
Machine Learning
Pose Estimation
Artificial Intelligence


Quantifying animal behavior is a crucial aspect of the ongoing neuroscientific endeavor to understand the brain, since it is a prerequisite for studying how neural computations relate to behavioral outputs. One way to obtain an objective yet detailed description of an animal's unconstrained, and therefore natural, behavior is to estimate its pose, i.e. the collective positions and orientations of all individual body parts in space at a given moment in time. While various approaches have been proposed for estimating the pose of a freely moving animal, studies relying on video cameras to record the required behavioral data have so far neglected to reconstruct the actual skeleton of the animal and have only inferred the positions of anatomical landmarks on its body surface. Additionally, many approaches do not incorporate mechanistic knowledge of an animal's anatomy, which leaves room for improving pose reconstruction accuracy. Consequently, methods for quantifying skeletal animal poses during free motion sequences are desirable tools for future neuroscientific studies. The work presented in this thesis tackles the problem of inferring skeletal poses from recorded video data of freely moving animals via a constrained animal pose estimation framework, which reconstructs the underlying three-dimensional joint positions from observable surface markers while enforcing anatomical and temporal constraints. Anatomical constraints are implemented via a realistic skeleton model, which accounts for physiological joint angle limits, bone lengths and body symmetry. In addition, the skeleton model allows individual skeleton anatomies to be learned directly from recorded video data of behaving animals, taking into account subject-specific differences in bone lengths and body symmetry.
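To make the idea of anatomical constraints concrete, the following is a minimal sketch and not the skeleton model used in the thesis. It assumes a planar kinematic chain, whereas the thesis works in three dimensions. The function names, bone lengths and angle limits are hypothetical values chosen for illustration. Joint angles are clamped to their limits before the joint positions are computed, and the same bone-length parameters are reused for both body sides to mimic a symmetry constraint.

```python
import math


def clamp(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def forward_kinematics(root, bone_lengths, joint_angles, angle_limits):
    """Return the 2D joint positions of a planar kinematic chain.

    Each angle is relative to the previous bone and is clamped to its
    (lower, upper) limit before use, so the resulting pose always
    respects the joint angle limits. Bone lengths are fixed parameters.
    """
    x, y = root
    heading = 0.0
    positions = [(x, y)]
    for length, angle, (lower, upper) in zip(bone_lengths, joint_angles, angle_limits):
        heading += clamp(angle, lower, upper)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical two-bone limb; lengths and limits (radians) are made-up values.
shared_lengths = [2.0, 1.5]  # one parameter set for both body sides (symmetry)
limits = [(-math.pi / 2, math.pi / 2), (0.0, math.pi / 2)]

# The second angle of the left limb (pi) violates its limit and is clamped.
left = forward_kinematics((0.0, 0.0), shared_lengths, [0.0, math.pi], limits)
right = forward_kinematics((0.0, 0.0), shared_lengths, [0.0, math.pi / 2], limits)
```

In this sketch the out-of-range angle of the left limb is pulled back to the limit, so both limbs end in the same pose, and the distance between consecutive joints always equals the corresponding bone length.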
Furthermore, to ensure that reconstructed joint positions follow smooth motion trajectories, the proposed animal pose estimation framework also enforces temporal constraints. Specifically, temporal constraints are implemented via an underlying state space model, which allows a Bayesian smoother to be deployed for inferring bone rotations and an expectation-maximization algorithm to be used for learning the unknown probabilistic hyperparameters of the state space model. The proposed framework is evaluated with respect to its reconstruction accuracy and its usability for quantifying a range of different behaviors. By comparing learned skeleton anatomies with ground truth data obtained via magnetic resonance imaging, it is shown that the framework can learn three-dimensional joint positions and bone lengths solely from two-dimensional video data. In addition, to test whether poses of freely moving animals are inferred accurately, independently measured paw positions are obtained using a frustrated total internal reflection imaging system and compared to their reconstructed counterparts, and the effects of the enforced anatomical and temporal constraints are analyzed. This analysis shows the advantages of constrained over unconstrained animal pose estimation, since enforcing constraints reduces errors in reconstructed paw positions and orientations. Furthermore, to assess whether the proposed framework is capable of accurately quantifying common behaviors, periodic gait cycles are analyzed based on reconstructed skeletal poses, which shows that enforcing constraints is essential for successfully extracting characteristic movement patterns from recorded video data. Finally, the proposed framework is also used to quantify complex gap-crossing behaviors, where animals jump over gaps of various distances.
This analysis shows that reconstructing skeletal poses enables computing characteristic movement patterns during jumping and correlating skeletal kinematic quantities with each other as well as with the jumped distances. In summary, this thesis proposes an animal pose estimation framework that reconstructs anatomically plausible and temporally consistent three-dimensional skeletal poses of freely moving animals from two-dimensional video data. To achieve this, anatomical and temporal constraints are built into the proposed pose reconstruction framework, which proved to be essential for obtaining accurate pose reconstruction results. Consequently, this thesis contains analyses that demonstrate the importance of the implemented constraints in the context of animal pose estimation.
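As an illustration of how a state space model can enforce temporal smoothness, the following is a minimal sketch of a Bayesian smoother. It is a stand-in, not the model of the thesis: it assumes a scalar linear-Gaussian random walk with a Kalman filter forward pass and a Rauch-Tung-Striebel backward pass, whereas the thesis infers bone rotations. The noise variances `q` and `r` are fixed by hand here rather than learned by expectation-maximization, and all names and numbers are hypothetical.

```python
def kalman_filter(observations, q, r, m0, p0):
    """Forward pass for x_t = x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    means, variances = [], []
    m, p = m0, p0
    for y in observations:
        p_pred = p + q                # predict: mean is unchanged for a random walk
        gain = p_pred / (p_pred + r)  # Kalman gain
        m = m + gain * (y - m)        # update with the new observation
        p = (1.0 - gain) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances


def rts_smoother(means, variances, q):
    """Backward pass refining each filtered estimate using later observations."""
    s_means, s_vars = list(means), list(variances)
    for t in range(len(means) - 2, -1, -1):
        p_pred = variances[t] + q     # predicted variance for step t + 1
        g = variances[t] / p_pred     # smoother gain
        s_means[t] = means[t] + g * (s_means[t + 1] - means[t])
        s_vars[t] = variances[t] + g * g * (s_vars[t + 1] - p_pred)
    return s_means, s_vars


# Hypothetical noisy measurements of a single slowly varying quantity.
noisy = [0.0, 1.2, 0.8, 2.1, 1.9, 3.2]
f_means, f_vars = kalman_filter(noisy, q=0.1, r=0.5, m0=0.0, p0=1.0)
s_means, s_vars = rts_smoother(f_means, f_vars, q=0.1)
```

Because the backward pass also uses future observations, the smoothed variance at each time step is never larger than the filtered one. This is the sense in which a smoother, rather than a filter alone, yields time-consistent trajectories.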
