Improving the automated search of neural network architectures

URI: http://hdl.handle.net/10900/138640
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1386403
http://dx.doi.org/10.15496/publikation-79991
Document type: PhD thesis
Date: 2023-03-27
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Zell, Andreas (Prof. Dr.)
Day of Oral Examination: 2023-02-16
DDC Classification: 004 - Data processing and computer science
Keywords: Neuronales Netz, Maschinelles Lernen
Other Keywords:
machine learning
neural networks
neural architecture search
hyperparameter optimization
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

Machine learning is becoming increasingly common in our society, from recommendation systems, audio assistants, and autonomous cars to gadgets like image filters for social media. Many other branches of research and industry also plan to integrate artificial intelligence into their workflows in the near future. However, developing and improving such algorithms for many specific tasks requires corresponding amounts of funding and labor, both of which are often scarce. In machine learning, automated hyper-parameter optimization techniques are widely used to find suitable training parameters such as learning rates and batch sizes. They not only reduce the required labor but also mostly exceed their human competition in speed and quality. Based on similar concepts, automatically designed neural network architectures achieved state-of-the-art performance on modern tasks for the first time in 2016. The study of such processes, known as Neural Architecture Search, quickly gained interest as a possible solution to the shortage of labor and a logical next step in the development of machine learning.

This thesis focuses primarily on two aspects of neural architecture search: search space design and performance prediction. First, we systematically analyze a baseline search space and improve it with respect to network latency. Architectures discovered in the revised search space have equivalent accuracy but are twice as fast. We then investigate whether search space design can be automated as well. The proposed Prune and Replace algorithm can progressively search through and specialize a weakly defined search space, even if it contains vastly more architectures than the original space. Owing to multiple technical optimizations and considerations, the search requires less time than before and can discover better architectures. Second, we study performance prediction methods in different contexts. We conduct a large-scale hardware prediction study for various common predictors and study in detail how multi-objective architecture search is affected by multiple factors such as predictor quality. We also evaluate a modification to super-networks, a widely used accuracy prediction approach. While the change is currently hard to apply, it results in a consistently improved selection of architectures. We conclude by presenting UniNAS, a framework built to unify various architecture search concepts and approaches in a single code base. Based on argument trees, experiments can be designed flexibly, in great detail, and even from a graphical user interface.
