GLMSNET: SINGLE CHANNEL SPEECH SEPARATION FRAMEWORK IN NOISY AND REVERBERANT ENVIRONMENTS

Cited by: 1
Authors
Shi, Huiyu [1]
Chen, Xi [2]
Kong, Tianlong [1]
Yin, Shouyi [1]
Ouyang, Peng [2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] AI Lab, Lenovo Res, Beijing, Peoples R China
Keywords
Speech separation; speech enhancement; cocktail party problem; reverberation
DOI
10.1109/ASRU51503.2021.9688217
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In real noisy and reverberant environments, the performance of current single-channel speech separation algorithms degrades significantly. To address this, the paper proposes a novel speech separation framework, the Graph convolution and Leading global Multi-scale separation network (GLMSnet). A graph convolution network (GCN) is applied to high-level features to model global context and incorporate long-range information, and it can be inserted at any desired position in the network. In addition, global multi-scale convolution is proposed to aggregate features from different levels and improve the audio quality of the separated speech. A leading factor is applied to increase the valid information of the target speech. The method is evaluated on the WHAMR! dataset. The results show that it achieves state-of-the-art speech separation performance in the presence of noise and reverberation, improving on the previous best model by 22.7%.
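As a rough illustration of how a graph convolution can spread information across an entire feature sequence, the following minimal sketch applies one generic GCN layer, ReLU(Â H W), to a toy set of frame features. This is not the GLMSnet implementation: the symmetric normalization, the fully connected graph, the identity weight matrix, and every name in the code (`normalize_adjacency`, `matmul`, `graph_conv`) are assumptions chosen only to keep the example small and checkable.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own feature vector.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_conv(features, adj, weight):
    """One generic graph-convolution layer: ReLU(A_norm @ features @ weight)."""
    mixed = matmul(normalize_adjacency(adj), features)
    out = matmul(mixed, weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy input (hypothetical): 3 frames with 2 features each. Every frame is linked
# to every other frame, so each output row mixes information from the whole sequence.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weight keeps the result easy to verify
out = graph_conv(features, adj, weight)
```

With this fully connected toy graph, each output row is the average of all input rows, which shows in miniature how a graph layer can carry long-range context that a local convolution would not reach in one step.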
Pages: 663-670
Page count: 8