A Low Computation Cost Model for Real-Time Speech Enhancement

Cited by: 0

Authors:

Wang, Qirui [1]

Zhou, Lin [1]

Cao, Yanxiang [1]

Zhuang, Chenghao [1]

Cheng, Yunling [1]

Deng, Yuxi [1]

Affiliations:

[1] Southeast University, School of Information Science and Engineering, Nanjing, People's Republic of China

Source:

2024 13th International Conference on Communications, Circuits and Systems, ICCCAS 2024 | 2024

Keywords:

conformer; low computation cost; real-time; speech enhancement

DOI:

10.1109/ICCCAS62034.2024.10652686

Abstract:

Developing speech enhancement systems for real-time scenarios is challenging because such systems need low computational complexity, parallel processing, and a causal structure. In this paper, we propose a speech enhancement model that operates in the time-frequency domain and uses only one-dimensional (1D) operations to reduce computation cost. Specifically, the proposed model follows a U-Net structure with several conformer blocks inserted. Our evaluation on the DNS Challenge and VoiceBank + DEMAND benchmarks shows that the model performs comparably to other state-of-the-art causal systems. Most importantly, the proposed model needs only 0.70 G MACs to process a 1-second speech signal (16,000 samples) and achieves a real-time factor (RTF) of 0.012, indicating that it significantly reduces computational cost.
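To make the two efficiency figures in the abstract concrete, the following is a minimal sketch of how they can be read. It assumes the common definition of RTF as processing time divided by audio duration; the abstract does not state the definition or the hardware used for measurement. The function names `real_time_factor`, `macs_per_sample`, and `measure_rtf` are hypothetical helpers for illustration and are not taken from the paper.

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Assumed definition: RTF = processing time / duration of audio processed.

    Values below 1.0 mean the system runs faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def macs_per_sample(total_macs: float, num_samples: int) -> float:
    """Average multiply-accumulate operations spent per input sample."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return total_macs / num_samples


def measure_rtf(
    enhance: Callable[[Sequence[float]], object],
    signal: Sequence[float],
    sample_rate: int,
) -> float:
    """Time one call of a hypothetical `enhance` function and return its RTF."""
    start = time.perf_counter()
    enhance(signal)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(signal) / sample_rate)


if __name__ == "__main__":
    # Figures from the abstract: 0.70 G MACs per 16,000 samples, RTF 0.012.
    print(macs_per_sample(0.70e9, 16000))   # MACs per sample
    print(real_time_factor(0.012, 1.0))     # 12 ms of compute per 1 s of audio
```

Under that assumed definition, an RTF of 0.012 corresponds to roughly 12 ms of computation per second of audio, and 0.70 G MACs over 16,000 samples averages 43,750 MACs per sample.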


Pages: 267-271

Page count: 5
