GLMSNET: SINGLE CHANNEL SPEECH SEPARATION FRAMEWORK IN NOISY AND REVERBERANT ENVIRONMENTS

Cited by: 1
Authors
Shi, Huiyu [1]
Chen, Xi [2]
Kong, Tianlong [1]
Yin, Shouyi [1]
Ouyang, Peng [2]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] AI Lab, Lenovo Res, Beijing, Peoples R China
Keywords
Speech separation; speech enhancement; cocktail party problem; reverberation
DOI
10.1109/ASRU51503.2021.9688217
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In real noisy and reverberant environments, the performance of current single-channel speech separation algorithms degrades significantly. To address this, this paper proposes a novel speech separation framework called the Graph convolution and Leading global Multi-scale separation network (GLMSnet). A graph convolution network (GCN) is introduced on high-level features to model global context and incorporate long-range information, and it can be inserted at any desired position in the network. Furthermore, global multi-scale convolution is proposed to aggregate features from different levels and improve the audio quality of the separated speech. A leading factor is applied to increase the valid information of the target speech. We evaluate our method on the WHAMR! dataset. The results show that the proposed method achieves state-of-the-art speech separation performance in the presence of noise and reverberation. Compared with the previous best model, performance improves by 22.7%.
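The abstract does not say how the graph is built or how the GCN is parameterised. As a rough illustration of the general idea only (a graph convolution over high-level frame features that keeps the feature shape, so it can be inserted between existing layers), here is a minimal sketch in plain Python. The fully connected similarity graph, the residual connection, and the names `graph_conv`, `features` and `weight` are assumptions made for this example, not the authors' implementation.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def graph_conv(features, weight):
    """One residual graph-convolution step over frame-level features.

    features: T x D list of rows (one graph node per time frame).
    weight:   D x D projection (hypothetical; learned in a real model).
    """
    # Assumed adjacency: softmax-normalised dot-product similarity, so every
    # frame is connected to every other frame (global, long-range context).
    transposed = [list(col) for col in zip(*features)]
    adjacency = [softmax(row) for row in matmul(features, transposed)]
    # Aggregate neighbour features, then project them: A @ X @ W.
    aggregated = matmul(matmul(adjacency, features), weight)
    # The residual connection keeps the T x D shape, so the layer can sit
    # between any two existing layers without changing their interfaces.
    return [[x + a for x, a in zip(frow, arow)]
            for frow, arow in zip(features, aggregated)]

# Toy input: 3 time frames with 2 feature channels each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output = graph_conv(features, identity)
```

Because the output has the same shape as the input, such a layer could in principle be placed at any depth, which matches the abstract's remark that the GCN can be inserted at any desired position.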
Pages: 663-670
Page count: 8