Semi-Supervised Source Localization in Reverberant Environments with Deep Generative Modeling

Cited by: 0
Authors
Bianco, Michael J. [1]
Gannot, Sharon [2]
Fernandez-Grande, Efren [3]
Gerstoft, Peter [1]
Affiliations
[1] Marine Physical Laboratory, University of California San Diego, San Diego, CA 92093, United States
[2] Faculty of Engineering, Bar-Ilan University, Ramat-Gan 5290002, Israel
[3] Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby 2800, Denmark
Funding
EU Horizon 2020
Keywords
Multiple signal classification - Learning systems - Music - Signal processing - Reverberation - Supervised learning - Computer music
DOI
Not available
Abstract
Localization in reverberant environments remains an open challenge. Recently, supervised learning approaches have demonstrated very promising results in addressing reverberation. However, even with large data volumes, the number of labels available for supervised learning in such environments is usually small. We propose to address this issue with a semi-supervised learning (SSL) approach, based on deep generative modeling. Our chosen deep generative model, the variational autoencoder (VAE), is trained to generate the phase of relative transfer functions (RTFs) between microphones. In parallel, a direction of arrival (DOA) classifier network based on RTF-phase is also trained. The joint generative and discriminative model, deemed VAE-SSL, is trained using labeled and unlabeled RTF-phase sequences. In learning to generate and classify the sequences, the VAE-SSL extracts the physical causes of the RTF-phase (i.e., source location) from distracting signal characteristics such as noise and speech activity. This facilitates effective end-to-end operation of the VAE-SSL, which requires minimal preprocessing of RTF-phase. VAE-SSL is compared with two signal processing-based approaches, steered response power with phase transform (SRP-PHAT) and MUltiple SIgnal Classification (MUSIC), as well as fully supervised CNNs. The approaches are compared using data from two real acoustic environments - one of which was recently obtained at Technical University of Denmark specifically for our study. We find that VAE-SSL can outperform the conventional approaches and the CNN in label-limited scenarios. Further, the trained VAE-SSL system can generate new RTF-phase samples which capture the physics of the acoustic environment. Thus, the generative modeling in VAE-SSL provides a means of interpreting the learned representations. To the best of our knowledge, this paper presents the first approach to modeling the physics of acoustic propagation using deep generative modeling. © 2013 IEEE.
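The abstract names the phase of the relative transfer function (RTF) between microphones as the input to both the VAE and the DOA classifier, but it does not say how the RTF is estimated. As a rough illustration only, the sketch below uses a common cross-spectrum estimator (averaged cross-spectrum divided by the reference auto-spectrum) and keeps only the phase. The function names `stft` and `rtf_phase`, the frame length, the hop size, and the estimator itself are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Hann-windowed short-time Fourier transform; returns an array of shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def rtf_phase(x_ref, x_other, n_fft=256, hop=128, eps=1e-12):
    """Estimate the RTF phase per frequency bin (hypothetical helper, assumed estimator).

    The RTF is approximated as the frame-averaged cross-spectrum between the
    second microphone and the reference, divided by the reference auto-spectrum.
    """
    X_ref = stft(x_ref, n_fft, hop)
    X_other = stft(x_other, n_fft, hop)
    cross = np.mean(X_other * np.conj(X_ref), axis=0)   # averaged cross-spectrum
    auto = np.mean(np.abs(X_ref) ** 2, axis=0)          # averaged reference auto-spectrum
    rtf = cross / (auto + eps)
    return np.angle(rtf)                                # phase in radians, wrapped to (-pi, pi]

# Toy check: the second microphone receives the reference signal delayed by 2 samples.
rng = np.random.default_rng(0)
x_ref = rng.standard_normal(16000)
delay = 2
x_other = np.concatenate([np.zeros(delay), x_ref[:-delay]])
phase = rtf_phase(x_ref, x_other)
```

In this anechoic toy case the phase is close to a straight line in frequency with slope proportional to the delay. In a reverberant room the RTF phase departs from that line, which is the regime the abstract targets.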
Pages: 84956 - 84970
Related papers
50 in total
  • [31] Label-Noise Robust Deep Generative Model for Semi-Supervised Learning
    Yoon, Heegeon
    Kim, Heeyoung
    TECHNOMETRICS, 2023, 65 (01) : 83 - 95
  • [32] Multimodal deep generative adversarial models for scalable doubly semi-supervised learning
    Du, Changde
    Du, Changying
    He, Huiguang
    INFORMATION FUSION, 2021, 68 : 118 - 130
  • [33] MANIFOLD-BASED BAYESIAN INFERENCE FOR SEMI-SUPERVISED SOURCE LOCALIZATION
    Laufer-Goldshtein, Bracha
    Talmon, Ronen
    Gannot, Sharon
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016 : 6335 - 6339
  • [34] Sound Source Localization Inside a Structure Under Semi-Supervised Conditions
    Kita, Shunsuke
    Kajikawa, Yoshinobu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 1397 - 1408
  • [35] Semi-supervised geodesic Generative Topographic Mapping
    Cruz-Barbosa, Raul
    Vellido, Alfredo
    PATTERN RECOGNITION LETTERS, 2010, 31 (03) : 202 - 209
  • [36] Semi-supervised Learning in Nonstationary Environments
    Ditzler, Gregory
    Polikar, Robi
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011 : 2741 - 2748
  • [37] Semi-Supervised Response Modeling
    Lee, Hyoung-joo
    Shin, Hyunjung
    Hwang, Seong-Seob
    Cho, Sungzoon
    MacLachlan, Douglas
    JOURNAL OF INTERACTIVE MARKETING, 2010, 24 (01) : 42 - 54
  • [38] Semi-Supervised Deep Adversarial Forest for Cross-Environment Localization
    Cui, Wei
    Zhang, Le
    Li, Bing
    Chen, Zhenghua
    Wu, Min
    Li, Xiaoli
    Kang, Jiawen
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (09) : 10215 - 10219
  • [39] Fast detection and localization of mitosis using a semi-supervised deep representation
    Castro, Santiago
    Romo-Bucheli, David
    Guayacan, Luis
    Martinez, Fabio
    MEDICAL IMAGING 2023, 2023, 12471
  • [40] Semi-supervised protein subcellular localization
    Xu, Qian
    Hu, Derek Hao
    Xue, Hong
    Yu, Weichuan
    Yang, Qiang
    BMC BIOINFORMATICS, 2009, 10