Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement

Cited by: 0
Authors
Zhang, Wangyou [1, 2]
Saijo, Kohei [3]
Jung, Jee-weon [2]
Li, Chenda [1, 2]
Watanabe, Shinji [2]
Qian, Yanmin [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Waseda Univ, Tokyo, Japan
Source
Funding
National Science Foundation (USA);
Keywords
speech enhancement; scalability; robustness; generalizability;
DOI
10.21437/Interspeech.2024-1266
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Deep learning-based speech enhancement (SE) models have achieved impressive performance in the past decade. Numerous advanced architectures have been designed to deliver state-of-the-art performance; however, their scalability potential remains unrevealed. Meanwhile, the majority of research focuses on small-sized datasets with restricted diversity, leading to a plateau in performance improvement. In this paper, we aim to provide new insights for addressing the above issues by exploring the scalability of SE models in terms of architectures, model sizes, compute budgets, and dataset sizes. Our investigation involves several popular SE architectures and speech data from different domains. Experiments reveal both similarities and distinctions between the scaling effects in SE and other tasks such as speech recognition. These findings further provide insights into the under-explored SE directions, e.g., larger-scale multi-domain corpora and efficiently scalable architectures.
Pages: 1740-1744
Page count: 5