From Algorithmic to Neural Beamforming

Ziegler, Jonathan David


dc.contributor.advisor	Schilling, Andreas (Prof. Dr.)
dc.contributor.author	Ziegler, Jonathan David
dc.date.accessioned	2022-04-01T08:48:57Z
dc.date.available	2022-04-01T08:48:57Z
dc.date.issued	2022-04-01
dc.identifier.uri	http://hdl.handle.net/10900/125877
dc.identifier.uri	http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1258772	de_DE
dc.identifier.uri	http://dx.doi.org/10.15496/publikation-67240
dc.identifier.uri	http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1258776	de_DE
dc.identifier.uri	http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1258777	de_DE
dc.description.abstract	Human interaction increasingly relies on telecommunication as an addition to or replacement for immediate contact. Direct interaction with smart devices, beyond the use of classical input devices such as the keyboard, has become common practice. Remote participation in conferences, sporting events, or concerts is more common than ever, and with current global restrictions on in-person contact, this has become an inevitable part of many people's reality. The work presented here aims to improve these encounters by enhancing the auditory experience. Improving fidelity and intelligibility can increase the perceived quality and enjoyment of such interactions and potentially raise acceptance of modern forms of remote experience. Two approaches to automatic source localization and multichannel signal enhancement are investigated for applications ranging from small conferences to large arenas. In the first approach, three first-order microphones of fixed relative position and orientation are used to build a compact, reactive tracking and beamforming system capable of producing pristine audio signals in small and mid-sized acoustic environments. With inaudible beam steering and a highly linear frequency response, this system aims to provide an alternative to manually operated shotgun microphones or sets of individual spot microphones, applicable in broadcast, live events, teleconferencing, and human-computer interaction. The array design and choice of capsules are discussed, as well as the challenges of preventing coloration for moving signals. The developed algorithm, based on Energy-Based Source Localization, is described and its performance is analyzed. Objective results on synthesized audio and on real recordings are presented, together with the results of multiple listening tests, and real-time considerations are highlighted.
In the second approach, multiple microphones with unknown spatial distribution are combined to create a large-aperture array using an end-to-end deep learning approach. This method combines state-of-the-art single-channel signal separation networks with adaptive, domain-specific channel alignment. The Neural Beamformer is capable of learning to extract detailed spatial relations between channels with respect to a learned signal type, such as speech, and of applying appropriate corrections to align the signals. This creates an adaptive beamformer for microphones spaced up to the order of 100 m apart. The developed modules are analyzed in detail and multiple configurations are considered for different use cases. Signal processing inside the neural network is interpreted and objective results are presented on simulated and semi-simulated datasets.	en
dc.language.iso	de	de_DE
dc.language.iso	en	de_DE
dc.publisher	Universität Tübingen	de_DE
dc.rights	ubt-podok	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en	en
dc.subject.ddc	004	de_DE
dc.subject.other	Neural Networks	en
dc.subject.other	Beamforming	en
dc.subject.other	LSTM	en
dc.subject.other	Microphone Arrays	en
dc.subject.other	Deep Neural Networks	en
dc.subject.other	Cross Correlation	en
dc.subject.other	Neural Beamforming	en
dc.subject.other	Digital Signal Processing	en
dc.title	From Algorithmic to Neural Beamforming	en
dc.type	PhDThesis	de_DE
dcterms.dateAccepted	2022-01-28
utue.publikation.fachbereich	Informatik	de_DE
utue.publikation.fakultaet	7 Mathematisch-Naturwissenschaftliche Fakultät	de_DE
utue.publikation.noppn	yes	de_DE

Files:	Dissertation_Jonathan_David_Ziegler_From ... 13.4 MB PDF

This document appears in:

7 Mathematisch-Naturwissenschaftliche Fakultät [5054]
