Semi-Supervised Source Localization in Reverberant Environments with Deep Generative Modeling

Cited by: 0

Authors:

Bianco, Michael J. [1]

Gannot, Sharon [2]

Fernandez-Grande, Efren [3]

Gerstoft, Peter [1]

Affiliations:

[1] Marine Physical Laboratory, University of California San Diego, San Diego, CA, 92093, United States

[2] Faculty of Engineering, Bar-Ilan University, Ramat-Gan, 5290002, Israel

[3] Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, 2800, Denmark

Source:

IEEE Access | 2021 / Vol. 9

Funding:

European Union Horizon 2020

Keywords:

Multiple signal classification - Learning systems - Music - Signal processing - Reverberation - Supervised learning - Computer music

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Localization in reverberant environments remains an open challenge. Recently, supervised learning approaches have demonstrated very promising results in addressing reverberation. However, even with large data volumes, the number of labels available for supervised learning in such environments is usually small. We propose to address this issue with a semi-supervised learning (SSL) approach, based on deep generative modeling. Our chosen deep generative model, the variational autoencoder (VAE), is trained to generate the phase of relative transfer functions (RTFs) between microphones. In parallel, a direction of arrival (DOA) classifier network based on RTF-phase is also trained. The joint generative and discriminative model, deemed VAE-SSL, is trained using labeled and unlabeled RTF-phase sequences. In learning to generate and classify the sequences, the VAE-SSL extracts the physical causes of the RTF-phase (i.e., source location) from distracting signal characteristics such as noise and speech activity. This facilitates effective end-to-end operation of the VAE-SSL, which requires minimal preprocessing of RTF-phase. VAE-SSL is compared with two signal processing-based approaches, steered response power with phase transform (SRP-PHAT) and MUltiple SIgnal Classification (MUSIC), as well as fully supervised CNNs. The approaches are compared using data from two real acoustic environments - one of which was recently obtained at Technical University of Denmark specifically for our study. We find that VAE-SSL can outperform the conventional approaches and the CNN in label-limited scenarios. Further, the trained VAE-SSL system can generate new RTF-phase samples which capture the physics of the acoustic environment. Thus, the generative modeling in VAE-SSL provides a means of interpreting the learned representations. To the best of our knowledge, this paper presents the first approach to modeling the physics of acoustic propagation using deep generative modeling. © 2013 IEEE.


Pages: 84956-84970
