Abstract:
Neural networks often excel when their inputs closely match the data on which they were trained, yet they frequently fail when inputs differ even slightly from that data. This issue, known as distribution shift, remains a significant challenge when deploying machine learning models in practical applications such as medical imaging and autonomous driving. Traditional methods for addressing distribution shift typically require additional training or data collection, which may not be feasible for models that are already deployed. This thesis explores alternative strategies for enhancing the robustness of already trained models to distribution shifts.
The first part of this work introduces a benchmark specifically designed to evaluate test-time adaptation (TTA) methods under prolonged and varied distribution shifts. Using this benchmark, we demonstrate that while existing TTA techniques initially improve performance, their performance often degrades as adaptation continues. We also propose a simple baseline method that consistently outperforms the other tested methods, maintaining high performance throughout prolonged adaptation.
Building on these insights, the second part analyzes the underlying mechanisms of entropy-based loss functions commonly employed in TTA. We show that entropy minimization initially clusters embeddings of similar images together, thus increasing accuracy. However, continued entropy minimization eventually drives input image embeddings further away from training embeddings, thereby reducing accuracy. Leveraging this insight, we propose Weighted Flips (WF), a novel method capable of predicting model accuracy on arbitrary image sets without the need for labeled data.
The final part of this work extends the principles of TTA to language models (LMs), focusing on the task of literature recommendation. We propose a benchmark that evaluates LMs on their ability to identify academic papers given a short description that references them. Our benchmark demonstrates that LMs are unable to perform this task effectively. We therefore propose a simple agent that allows LMs to search for and read relevant papers, significantly improving their performance.