Bregman proximal minimization algorithms, analysis and applications

DSpace Repository



URI: http://hdl.handle.net/10900/124684
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1246843
http://dx.doi.org/10.15496/publikation-66047
Document Type: Dissertation
Date: 2022-02-18
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Mathematik
Advisor: Ochs, Peter (Prof. Dr.)
Day of Oral Examination: 2021-11-16
DDC Classification: 510 - Mathematics
Other Keywords:
Bregman proximal minimization
non-Euclidean distances
Bregman distance
global convergence
non-convex non-smooth optimization
License: Publishing license including print on demand
Order a printed copy: Print-on-Demand

Abstract:

In this thesis, we tackle the optimization of several non-smooth and non-convex objectives that arise in practice. Classical results for Proximal Gradient algorithms rely on the so-called Lipschitz continuous gradient property. This condition fails for many objectives in practice, including those arising in matrix factorization, deep neural networks, phase retrieval, image denoising and many other applications. A recent development, the L-smad property, allows us to handle such objectives via so-called Bregman distances, which generalize the Euclidean distance. Based on the L-smad property, the Bregman Proximal Gradient (BPG) algorithm is already well established. In our work, we propose an inertial variant of BPG, namely CoCaIn BPG, which incorporates adaptive inertia based on the function's local behavior. Moreover, we prove global convergence of the sequence generated by CoCaIn BPG to a critical point of the function. CoCaIn BPG outperforms BPG by a significant margin, which is attributed to the proposed non-standard double backtracking technique. A major challenge in working with BPG-based methods is designing a Bregman distance suited to the objective. In this regard, we propose Bregman distances for three applications: matrix factorization, deep matrix factorization and deep neural networks. We start with the matrix factorization setting and propose the relevant Bregman distances, then tackle the deep matrix factorization and deep neural network settings. In all these settings, we also derive closed-form update steps for BPG-based methods, which is crucial for practical application, as well as closed-form inertia suitable for the efficient application of CoCaIn BPG. However, up to this point the setting is restricted to additive composite problems; generic composite problems, such as the objectives arising in robust phase retrieval, remain out of scope.
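To make the BPG update concrete, the following is a minimal illustrative sketch, not code from the thesis: it implements one mirror-descent-style BPG step for a smooth objective (non-smooth part omitted) using a Bregman distance D_h(x, y) = h(x) − h(y) − ⟨∇h(y), x − y⟩. All function names, the kernel h, and the test objective f are placeholder assumptions; with the Euclidean kernel h(x) = ½‖x‖², the step reduces to a plain gradient step.

```python
import numpy as np

def bregman_distance(h, grad_h, x, y):
    # D_h(x, y) = h(x) - h(y) - <grad h(y), x - y>
    return h(x) - h(y) - grad_h(y) @ (x - y)

def bpg_step(grad_f, grad_h, grad_h_inv, x, tau):
    # One BPG step for the smooth part only (g = 0):
    # grad h(x_next) = grad h(x) - tau * grad f(x)
    return grad_h_inv(grad_h(x) - tau * grad_f(x))

# Euclidean kernel h(x) = 0.5 ||x||^2 (placeholder choice):
# grad h is the identity, so BPG reduces to gradient descent.
h = lambda x: 0.5 * (x @ x)
grad_h = lambda x: x
grad_h_inv = lambda z: z

# Placeholder quadratic objective f(x) = 0.5 ||x||^2, grad f(x) = x.
grad_f = lambda x: x

x = np.array([1.0, -2.0])
x_next = bpg_step(grad_f, grad_h, grad_h_inv, x, tau=0.5)
# with the Euclidean kernel this equals the gradient step x - 0.5 * grad_f(x)
```

Swapping in a non-Euclidean kernel h (e.g. an entropy or a polynomial kernel, as designed in the thesis for matrix factorization) changes only `grad_h` and `grad_h_inv`; the step structure stays the same.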
To tackle generic composite problems, the L-smad property needs to be generalized further. In this regard, we propose the MAP property, based on which we develop the Model BPG algorithm. Classical convergence-analysis techniques based on the function value prove to be restrictive here; thus, we propose a novel Lyapunov function that is suitable for the global convergence analysis. We later unify Model BPG and CoCaIn BPG into Model CoCaIn BPG, for which we provide global convergence results. We supplement all our theoretical results with relevant empirical observations that show the competitive performance of our methods compared to existing state-of-the-art optimization methods.
