Multi-task Learning for End-to-end Noise-robust Bandwidth Extension

Cited by: 11
Authors
Hou, Nana [1]
Xu, Chenglin [1, 4]
Zhou, Joey Tianyi [3]
Chng, Eng Siong [1, 2]
Li, Haizhou [4, 5]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Nanyang Technol Univ, Temasek Labs, Singapore, Singapore
[3] ASTAR, Inst High Performance Comp IHPC, Singapore, Singapore
[4] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[5] Univ Bremen, Machine Listening Lab, Bremen, Germany
Source
Funding
National Research Foundation, Singapore
Keywords
Noise-robust bandwidth extension; multi-task learning; time-domain masking; temporal convolutional network; NEURAL-NETWORK; SPEECH;
DOI
10.21437/Interspeech.2020-2022
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Bandwidth extension aims to reconstruct wideband speech signals from narrowband inputs to improve perceptual quality. Prior studies mostly perform bandwidth extension under the assumption that the narrowband signals are clean, without noise. In practice, the use of such extension techniques is greatly limited when signals are corrupted by noise. To alleviate this problem, we propose an end-to-end time-domain framework for noise-robust bandwidth extension that jointly optimizes a mask-based speech enhancement module and an ideal bandwidth extension module with multi-task learning. The proposed framework avoids decomposing the signals into magnitude and phase spectra and therefore requires no phase estimation. Experimental results show that the proposed method achieves 14.3% and 15.8% relative improvements over the best baseline in terms of perceptual evaluation of speech quality (PESQ) and log-spectral distortion (LSD), respectively. Furthermore, our method is three times more compact than the best baseline in terms of the number of parameters.
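The relative improvements quoted in the abstract cover two metrics that move in opposite directions: PESQ is better when higher, LSD is better when lower. The sketch below shows one common way such percentages are computed. The function name `relative_improvement` and the example scores are hypothetical illustrations, not values or code from the paper, and the record does not state which formula the authors used.

```python
def relative_improvement(baseline: float, proposed: float, higher_is_better: bool) -> float:
    """Return the relative improvement of `proposed` over `baseline`, in percent.

    Assumption: improvement is measured relative to the baseline score, with the
    sign flipped for metrics where a lower value is better (such as LSD).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = proposed - baseline if higher_is_better else baseline - proposed
    return 100.0 * delta / abs(baseline)


# Hypothetical scores for illustration only (NOT taken from the paper).
pesq_gain = relative_improvement(baseline=2.00, proposed=2.50, higher_is_better=True)
lsd_gain = relative_improvement(baseline=2.00, proposed=1.50, higher_is_better=False)

print(f"PESQ: {pesq_gain:.1f}%  LSD: {lsd_gain:.1f}%")
```

With these made-up inputs both gains come out to 25.0%, which shows that a drop in LSD and a rise in PESQ are both reported as positive improvements under this convention.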
Pages: 4069-4073
Page count: 5
Related papers
50 in total
  • [1] End-to-End Multi-Task Learning with Attention
    Liu, Shikun
    Johns, Edward
    Davison, Andrew J.
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 1871 - 1880
  • [2] Noise-robust Attention Learning for End-to-End Speech Recognition
    Higuchi, Yosuke
    Tawara, Naohiro
    Ogawa, Atsunori
    Iwata, Tomoharu
    Kobayashi, Tetsunori
    Ogawa, Tetsuji
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 311 - 315
  • [3] End-to-End Deep Learning for Phase Noise-Robust Multi-Dimensional Geometric Shaping
    Talreja, Veeru
    Koike-Akino, Toshiaki
    Wang, Ye
    Millar, David S.
    Kojima, Keisuke
    Parsons, Kieran
    2020 EUROPEAN CONFERENCE ON OPTICAL COMMUNICATIONS (ECOC), 2020,
  • [4] Multi-task Learning with Attention for End-to-end Autonomous Driving
    Ishihara, Keishi
    Kanervisto, Anssi
    Miura, Jun
    Hautamaki, Ville
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2896 - 2905
  • [5] Adversarial Multi-task Learning for End-to-end Metaphor Detection
    Zhang, Shenglong
    Liu, Ying
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1483 - 1497
  • [6] Adversarial Multi-Task Learning for Robust End-to-End ECG-based Heartbeat Classification
    Shahin, Mostafa
    Oo, Ethan
    Ahmed, Beena
    42ND ANNUAL INTERNATIONAL CONFERENCES OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY: ENABLING INNOVATIVE TECHNOLOGIES FOR GLOBAL HEALTHCARE EMBC'20, 2020, : 341 - 344
  • [7] An End-to-End Scalable Iterative Sequence Tagging with Multi-Task Learning
    Gui, Lin
    Du, Jiachen
    Zhao, Zhishan
    He, Yulan
    Xu, Ruifeng
    Fan, Chuang
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2018, PT II, 2018, 11109 : 288 - 298
  • [8] Hybrid Multi-Task Learning for End-To-End Multimodal Emotion Recognition
    Chen, Junjie
    Li, Yongwei
    Zhao, Ziping
    Liu, Xuefei
    Wen, Zhengqi
    Tao, Jianhua
    2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 1966 - 1971
  • [9] Rethinking and Improving Multi-task Learning for End-to-end Speech Translation
    Zhang, Yuhao
    Xu, Chen
    Li, Bei
    Chen, Hao
    Xiao, Tong
    Zhang, Chunliang
    Zhu, Jingbo
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 10753 - 10765
  • [10] End-to-End Multi-Task Learning for Lung Nodule Segmentation and Diagnosis
    Chen, Wei
    Wang, Qiuli
    Yang, Dan
    Zhang, Xiaohong
    Liu, Chen
    Li, Yucong
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 6710 - 6717