SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments

Cited by: 0

Authors:

Wang, Liusong [1,2]

Gao, Yuan [1,2]

Cao, Kaimin [1,2]

Hu, Ying [1,2]

Affiliations:

[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi, Peoples R China

[2] Key Lab Signal Detect & Proc Xinjiang, Urumqi, Peoples R China

Source:

MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024 | 2025 / Vol. 2312

Keywords:

Speech enhancement; Speech separation; Noisy reverberant environment; Former block;

DOI:

10.1007/978-981-96-1045-7_4

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Speech enhancement and separation in noisy reverberant environments are very challenging tasks. In this paper, we propose SESNet, a speech enhancement and separation network for noisy reverberant environments. SESNet is a multi-scale encoder-decoder architecture that includes a global-local feature extractor (GLFE). We also explore four kinds of Former blocks for use in the GLFE. We evaluate speech enhancement and speech separation performance on the VoiceBank+DEMAND and WHAMR! datasets. The experimental results show that SESNet achieves excellent performance on single- and multi-channel speech enhancement and on single-channel multi-speaker speech separation while keeping a small model size.

Pages: 44-54

Page count: 11
