From Algorithmic to Neural Beamforming

dc.contributor.advisor Schilling, Andreas (Prof. Dr.)
dc.contributor.author Ziegler, Jonathan David
dc.date.accessioned 2022-04-01T08:48:57Z
dc.date.available 2022-04-01T08:48:57Z
dc.date.issued 2022-04-01
dc.identifier.uri http://hdl.handle.net/10900/125877
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1258772 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-67240
dc.description.abstract Human interaction increasingly relies on telecommunication as an addition to or replacement for immediate contact. The direct interaction with smart devices, beyond the use of classical input devices such as the keyboard, has become common practice. Remote participation in conferences, sporting events, or concerts is more common than ever, and with current global restrictions on in-person contact, this has become an inevitable part of many people's reality. The work presented here aims at improving these encounters by enhancing the auditory experience. Augmenting fidelity and intelligibility can increase the perceived quality and enjoyability of such actions and potentially raise acceptance for modern forms of remote experiences. Two approaches to automatic source localization and multichannel signal enhancement are investigated for applications ranging from small conferences to large arenas. Three first-order microphones of fixed relative position and orientation are used to create a compact, reactive tracking and beamforming algorithm, capable of producing pristine audio signals in small and mid-sized acoustic environments. With inaudible beam steering and a highly linear frequency response, this system aims at providing an alternative to manually operated shotgun microphones or sets of individual spot microphones, applicable in broadcast, live events, and teleconferencing or for human-computer interaction. The array design and choice of capsules are discussed, as well as the challenges of preventing coloration for moving signals. The developed algorithm, based on Energy-Based Source Localization, is discussed and the performance is analyzed. Objective results on synthesized audio, as well as on real recordings, are presented. Results of multiple listening tests are presented and real-time considerations are highlighted. 
Multiple microphones with unknown spatial distribution are combined to create a large-aperture array using an end-to-end Deep-Learning approach. This method combines state-of-the-art single-channel signal separation networks with adaptive, domain-specific channel alignment. The Neural Beamformer is capable of learning to extract detailed spatial relations of channels with respect to a learned signal type, such as speech, and to apply appropriate corrections in order to align the signals. This creates an adaptive beamformer for microphones spaced on the order of up to 100 m. The developed modules are analyzed in detail and multiple configurations are considered for different use cases. Signal processing inside the Neural Network is interpreted and objective results are presented on simulated and semi-simulated datasets. en
dc.language.iso de de_DE
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.ddc 004 de_DE
dc.subject.other Neural Networks en
dc.subject.other Beamforming en
dc.subject.other LSTM en
dc.subject.other Microphone Arrays en
dc.subject.other Deep Neural Networks en
dc.subject.other Cross Correlation en
dc.subject.other Neural Beamforming en
dc.subject.other Digital Signal Processing en
dc.title From Algorithmic to Neural Beamforming en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2022-01-28
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE
