A Low Computation Cost Model for Real-Time Speech Enhancement

Authors
Wang, Qirui [1]
Zhou, Lin [1]
Cao, Yanxiang [1]
Zhuang, Chenghao [1]
Cheng, Yunling [1]
Deng, Yuxi [1]
Affiliation
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing, Peoples R China
Keywords
conformer; low computation cost; real-time; speech enhancement
DOI
10.1109/ICCCAS62034.2024.10652686
Abstract
Developing speech enhancement systems for real-time scenarios is challenging because such systems need low computational complexity, parallel processing, and a causal structure. In this paper, we propose a speech enhancement model that operates in the time-frequency domain and uses only one-dimensional operations to reduce computation cost. Specifically, the proposed model follows a U-Net structure with several conformer blocks inserted. Our evaluation on the DNS Challenge and VoiceBank + DEMAND benchmarks shows that the model performs comparably to other state-of-the-art causal systems. Most importantly, the proposed model needs only 0.70G MACs to process a 1-second speech signal (16,000 samples) and achieves an RTF (Real Time Factor) of 0.012, indicating that it significantly reduces computational cost.
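The two efficiency figures in the abstract can be related to simple arithmetic. The sketch below assumes the common definition of RTF as processing time divided by the duration of the audio processed; the record does not state the hardware or the measurement procedure, so the 0.012 s timing used in the example is only an illustration of what an RTF of 0.012 would correspond to for 1 second of audio. The function names `real_time_factor` and `macs_per_sample` are hypothetical helpers, not part of the paper.

```python
def real_time_factor(processing_seconds: float, num_samples: int,
                     sample_rate: int = 16000) -> float:
    """RTF = time spent processing / duration of the audio processed.

    Assumes the common definition; values below 1.0 mean the model
    runs faster than real time on the measured hardware.
    """
    audio_seconds = num_samples / sample_rate
    return processing_seconds / audio_seconds


def macs_per_sample(total_macs: float, num_samples: int) -> float:
    """Average multiply-accumulate operations per input sample."""
    return total_macs / num_samples


if __name__ == "__main__":
    # Illustrative timing: 0.012 s to process 16000 samples at 16 kHz.
    print(real_time_factor(0.012, 16000))
    # Figure from the abstract: 0.70G MACs for 16000 samples.
    print(macs_per_sample(0.70e9, 16000))
```

Under these assumptions, 0.70G MACs over 16,000 samples averages to 43,750 MACs per input sample.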
Pages: 267-271
Page count: 5