Underdetermined Source Separation Based on Generalized Multichannel Variational Autoencoder

Cited by: 18
Authors
Seki, Shogo [1]
Kameoka, Hirokazu [2]
Li, Li [3]
Toda, Tomoki [4]
Takeda, Kazuya [5]
Affiliations
[1] Nagoya Univ, Grad Sch Informat, Nagoya, Aichi 4640861, Japan
[2] NTT Corp, Atsugi, Kanagawa 2430198, Japan
[3] Univ Tsukuba, Grad Sch Syst & Informat Engn, Tsukuba, Ibaraki 3058573, Japan
[4] Nagoya Univ, Informat Technol Ctr, Nagoya, Aichi 4640861, Japan
[5] Nagoya Univ, Inst Innovat Future Soc, Nagoya, Aichi 4648603, Japan
Source
IEEE ACCESS | 2019 / Vol. 7
Keywords
Underdetermined source separation; variational autoencoder; non-negative matrix factorization; AUDIO SOURCE SEPARATION; NONNEGATIVE MATRIX FACTORIZATION
DOI
10.1109/ACCESS.2019.2954120
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper deals with a multichannel audio source separation problem under underdetermined conditions. Multichannel non-negative matrix factorization (MNMF) is a powerful method for underdetermined audio source separation, which adopts the NMF concept to model and estimate the power spectrograms of the sound sources in a mixture signal. This concept is also used in independent low-rank matrix analysis (ILRMA), a special class of the MNMF formulated under determined conditions. While these methods work reasonably well for particular types of sound sources, one limitation is that they can fail to work for sources with spectrograms that do not comply with the NMF model. To address this limitation, an extension of ILRMA called the multichannel variational autoencoder (MVAE) method was recently proposed, where a conditional VAE (CVAE) is used instead of the NMF model for expressing source power spectrograms. This approach has performed impressively in determined source separation tasks thanks to the representation power of deep neural networks. While the original MVAE method was formulated under determined mixing conditions, this paper proposes a generalized version of it by combining the ideas of MNMF and MVAE so that it can also deal with underdetermined cases. We call this method the generalized MVAE (GMVAE) method. In underdetermined source separation and speech enhancement experiments, the proposed method performed better than baseline methods.
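The NMF concept mentioned in the abstract, approximating a non-negative spectrogram by a product of two low-rank non-negative matrices, can be illustrated with a minimal sketch. This is not the authors' MNMF, ILRMA, or GMVAE algorithm: it is plain single-channel NMF with the standard Euclidean-distance multiplicative updates, run on a synthetic toy matrix. The function name `nmf` and its parameters (`rank`, `n_iter`, `eps`, `seed`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    Illustrative only: Euclidean-distance multiplicative updates,
    not the multichannel model described in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral templates (bases)
    H = rng.random((rank, n_frames)) + eps  # time-varying activations
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "power spectrogram": an exactly rank-2 non-negative matrix.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 8))

W0, H0 = nmf(V, rank=2, n_iter=0)   # random initialization only
W, H = nmf(V, rank=2, n_iter=200)   # same initialization, then updated

err_init = np.linalg.norm(V - W0 @ H0)
err_final = np.linalg.norm(V - W @ H)
print(f"reconstruction error: {err_init:.4f} -> {err_final:.4f}")
```

The low-rank assumption in this sketch is what the abstract identifies as the limitation: sources whose spectrograms are not well described by a few fixed templates fit the model poorly, which motivates replacing the NMF source model with a CVAE.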
Pages: 168104 - 168115
Number of pages: 12
Related papers
50 in total
  • [1] Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation
    Seki, Shogo
    Kameoka, Hirokazu
    Li, Li
    Toda, Tomoki
    Takeda, Kazuya
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [2] INVESTIGATION AND COMPARISON OF OPTIMIZATION METHODS FOR VARIATIONAL AUTOENCODER-BASED UNDERDETERMINED MULTICHANNEL SOURCE SEPARATION
    Seki, Shogo
    Kameoka, Hirokazu
    Li, Li
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 511 - 515
  • [3] Supervised Determined Source Separation with Multichannel Variational Autoencoder
    Kameoka, Hirokazu
    Li, Li
    Inoue, Shota
    Makino, Shoji
    NEURAL COMPUTATION, 2019, 31 (09) : 1891 - 1914
  • [4] Multichannel Variational Autoencoder-Based Speech Separation in Designated Speaker Order
    Liao, Lele
    Cheng, Guoliang
    Ruan, Haoxin
    Chen, Kai
    Lu, Jing
    SYMMETRY-BASEL, 2022, 14 (12):
  • [5] JOINT SEPARATION AND DEREVERBERATION OF REVERBERANT MIXTURES WITH MULTICHANNEL VARIATIONAL AUTOENCODER
    Inoue, Shota
    Kameoka, Hirokazu
    Li, Li
    Seki, Shogo
    Makino, Shoji
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 96 - 100
  • [6] UNSUPERVISED SPATIAL DICTIONARY LEARNING FOR SPARSE UNDERDETERMINED MULTICHANNEL SOURCE SEPARATION
    Nesta, Francesco
    Fakhry, Mahmoud
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 86 - 90
  • [7] Source Separation in Joint Communication and Radar Systems Based on Unsupervised Variational Autoencoder
    Alaghbari, Khaled A.
    Lim, Heng Siong
    Jin, Benzhou
    Shen, Yutong
    IEEE OPEN JOURNAL OF VEHICULAR TECHNOLOGY, 2024, 5 : 56 - 70
  • [8] On underdetermined source separation
    Taleb, A
    Jutten, C
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 1445 - 1448
  • [9] SVM based underdetermined blind source separation
    Li, Rong-Hua
    Yang, Zu-Yuan
    Zhao, Min
    Xie, Sheng-Li
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2009, 31 (02) : 319 - 322
  • [10] Speech Source Separation Using Variational Autoencoder and Bandpass Filter
    Do, Hao Duc
    Tran, Son Thai
    Chau, Duc Thanh
    IEEE ACCESS, 2020, 8 : 156219 - 156231