Multi-task Learning for End-to-end Noise-robust Bandwidth Extension

Cited by: 11
Authors
Hou, Nana [1]
Xu, Chenglin [1, 4]
Zhou, Joey Tianyi [3]
Chng, Eng Siong [1, 2]
Li, Haizhou [4, 5]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
[3] ASTAR, Inst High Performance Comp IHPC, Singapore, Singapore
[4] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[5] Univ Bremen, Machine Listening Lab, Bremen, Germany
Funding
National Research Foundation, Singapore
Keywords
Noise-robust bandwidth extension; multi-task learning; time-domain masking; temporal convolutional network; NEURAL-NETWORK; SPEECH
DOI
10.21437/Interspeech.2020-2022
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Bandwidth extension aims to reconstruct wideband speech signals from narrowband inputs to improve perceptual quality. Prior studies mostly perform bandwidth extension under the assumption that the narrowband signals are clean, without noise. The use of such extension techniques is greatly limited in practice when signals are corrupted by noise. To alleviate this problem, we propose an end-to-end time-domain framework for noise-robust bandwidth extension, which jointly optimizes a mask-based speech enhancement module and an ideal bandwidth extension module with multi-task learning. The proposed framework avoids decomposing the signals into magnitude and phase spectra and therefore requires no phase estimation. Experimental results show that the proposed method achieves 14.3% and 15.8% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and log-spectral distortion (LSD), respectively. Furthermore, our method is 3 times more compact than the best baseline in terms of the number of parameters.
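The abstract reports results in log-spectral distortion (LSD) without stating how it is computed. The following is a minimal sketch of one commonly used form of LSD, not the authors' implementation. The function names, the Hann window, the frame length of 512, the hop of 256, the base-10 logarithm, and the `eps` floor are all assumptions made for illustration.

```python
import numpy as np


def frame_power_spectra(signal, frame_len=512, hop=256):
    """Return per-frame power spectra |STFT|^2 (assumed Hann window)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def log_spectral_distortion(reference, estimate, eps=1e-10):
    """Mean over frames of the RMS log-power-spectrum difference.

    Both inputs are 1-D arrays of equal length, at least one frame long.
    Lower values mean the estimate is spectrally closer to the reference.
    """
    ref = np.log10(frame_power_spectra(reference) + eps)
    est = np.log10(frame_power_spectra(estimate) + eps)
    per_frame = np.sqrt(np.mean((ref - est) ** 2, axis=1))
    return float(np.mean(per_frame))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    wideband = rng.standard_normal(16000)  # stand-in for a reference signal
    degraded = wideband + 0.1 * rng.standard_normal(16000)
    print(log_spectral_distortion(wideband, degraded))
```

Because LSD is a distortion measure, a relative improvement such as the one reported above corresponds to a reduction in its value, whereas for PESQ an improvement is an increase.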
Pages: 4069-4073
Number of pages: 5