Low-Cost Bayesian Methods for Fixing Neural Networks' Overconfidence

URI: http://hdl.handle.net/10900/135535
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1355355
http://dx.doi.org/10.15496/publikation-76886
Document type: PhD thesis
Date: 2023-01-20
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Hennig, Philipp (Prof. Dr.)
Day of Oral Examination: 2023-01-13
DDC Classification: 004 - Data processing and computer science
Keywords: Machine Learning, Neural Network
Other Keywords:
Neural Network
Bayesian Deep Learning
Uncertainty Quantification
Laplace Approximations
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de (German), http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en (English)

Abstract:

Well-calibrated predictive uncertainty of neural networks—essentially making them know when they do not know—is paramount in safety-critical applications. However, deep neural networks are overconfident both far away from and near the training data. In this thesis, we study Bayesian neural networks and their extensions to mitigate this issue. First, we show that being Bayesian, even just at the last layer and in a post-hoc manner via Laplace approximations, helps mitigate overconfidence in deep ReLU classifiers. Then, we provide a cost-effective Gaussian-process extension to ReLU Bayesian neural networks that guarantees ReLU nets will never be overconfident in regions far from the data. Furthermore, we propose three ways of improving the calibration of general Bayesian neural networks in regions near the data by (i) refining parametric approximations to the Bayesian neural networks' posteriors with normalizing flows, (ii) training the uncertainty of Laplace approximations, and (iii) leveraging out-of-distribution data during training. Finally, we provide an easy-to-use library, laplace-torch, to facilitate the modern use of Laplace approximations in deep learning. It gives users a way to turn a standard pre-trained deep net into a Bayesian neural network in a cost-efficient manner.
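
The central tool throughout is the Laplace approximation: given a network trained to a MAP estimate, the weight posterior is approximated by a Gaussian centered at the trained weights, with covariance given by the inverse Hessian of the training objective. A minimal statement of the approximation (notation mine, not taken from the abstract):

\[
p(\theta \mid \mathcal{D}) \approx \mathcal{N}\!\bigl(\theta;\, \theta_{\mathrm{MAP}},\, \Sigma\bigr),
\qquad
\Sigma = \Bigl(\nabla^{2}_{\theta}\,\mathcal{L}(\theta)\big|_{\theta = \theta_{\mathrm{MAP}}}\Bigr)^{-1},
\]

where \(\mathcal{L}\) is the negative log joint (training loss plus regularizer).

As a rough illustration of the post-hoc workflow the abstract describes, the sketch below follows the laplace-torch library's documented interface; `model`, `train_loader`, and `x` are placeholders for a pre-trained torch module, a training DataLoader, and a test batch, and the exact argument names should be checked against the library's current documentation:

    from laplace import Laplace

    # Wrap a pre-trained classifier; only the last layer is treated as
    # Bayesian, with a Kronecker-factored Hessian approximation.
    la = Laplace(model, 'classification',
                 subset_of_weights='last_layer',
                 hessian_structure='kron')

    la.fit(train_loader)                           # post-hoc: estimates curvature, weights stay fixed
    la.optimize_prior_precision(method='marglik')  # tunes the prior via the marginal likelihood

    probs = la(x, link_approx='probit')            # uncertainty-aware predictive probabilities

Because the fit is post-hoc, the original network's predictions are unchanged at the mode; only the predictive uncertainty is added on top, which is what makes the approach cost-efficient.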
