Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement

Cited: 0
Authors
Yang, Hejung [1 ]
Kang, Hong-Goo [1 ]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
Source
INTERSPEECH 2023
Keywords
speech enhancement; self-supervised model; feature normalization; representation
DOI
10.21437/Interspeech.2023-623
CLC Number (Chinese Library Classification)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Large, pre-trained representation models trained using self-supervised learning have gained popularity in various fields of machine learning because they are able to extract high-quality salient features from input data. As such, they have been frequently used as base networks for various pattern classification tasks such as speech recognition. However, not much research has been conducted on applying these types of models to the field of speech signal generation. In this paper, we investigate the feasibility of using pre-trained speech representation models for a downstream speech enhancement task. To alleviate mismatches between the input features of the pre-trained model and the target enhancement model, we adopt a novel feature normalization technique to smoothly link these modules together. Our proposed method enables significant improvements in speech quality compared to baselines when combined with various types of pre-trained speech models.
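The abstract describes a feature normalization step that bridges the statistics of the pre-trained model's output and the input expected by the downstream enhancement model. The paper's exact formulation is not given in this record; as a rough illustration only, a generic per-dimension mean-variance normalization of self-supervised features (all names hypothetical) might look like:

```python
import numpy as np

def normalize_features(feats: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Normalize each feature dimension over the time axis.

    feats: (T, D) array of frame-level features from a pre-trained
    speech model. This is a generic sketch, not the paper's method.
    """
    mean = feats.mean(axis=0, keepdims=True)  # per-dimension mean over frames
    std = feats.std(axis=0, keepdims=True)    # per-dimension std over frames
    return (feats - mean) / (std + eps)

# Toy example: 100 frames of 768-dim features with a large offset/scale,
# as might come from an upstream model whose statistics mismatch the
# enhancement model's expected input range.
rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=5.0, size=(100, 768))
normed = normalize_features(feats)
# After normalization, per-dimension mean is ~0 and std is ~1.
```

In practice such a step would sit between the frozen (or fine-tuned) representation model and the enhancement head, so the downstream network sees inputs with consistent statistics regardless of which pre-trained model is plugged in.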
Pages: 814-818
Page count: 5
Related Papers
50 records in total
  • [1] Improving fine-tuning of self-supervised models with Contrastive Initialization
    Pan, Haolin
    Guo, Yong
    Deng, Qinyi
    Yang, Haomin
    Chen, Jian
    Chen, Yiqun
    NEURAL NETWORKS, 2023, 159 : 198 - 207
  • [2] Fine-tuning Strategies for Faster Inference Using Speech Self-Supervised Models: A Comparative Study
    Zaiem, Salah
    Algayres, Robin
    Parcollet, Titouan
    Essid, Slim
    Ravanelli, Mirco
    2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023,
  • [3] On Fine-tuning Pre-trained Speech Models with EMA-Target Self-Supervised Loss
    Yang, Hejung
    Kang, Hong-Goo
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 6360 - 6364
  • [4] Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
    Zaiem, Salah
    Parcollet, Titouan
    Essid, Slim
    INTERSPEECH 2023, 2023, : 67 - 71
  • [5] Fine-Tuning Self-Supervised Learning Models for End-to-End Pronunciation Scoring
    Zahran, Ahmed I.
    Fahmy, Aly A.
    Wassif, Khaled T.
    Bayomi, Hanaa
    IEEE ACCESS, 2023, 11 : 112650 - 112663
  • [6] Self-Supervised Fine-Tuning of Automatic Speech Recognition Systems against Signal Processing Attacks
    Jayawardena, Oshan
    Caldera, Dilmi
    Jayawardena, Sandani
    Sandeepa, Avishka
    Bindschaedler, Vincent
    Charles, Subodha
    PROCEEDINGS OF THE 19TH ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, ACM ASIACCS 2024, 2024, : 1272 - 1286
  • [7] Kaizen: Practical self-supervised continual learning with continual fine-tuning
    Tang, Chi Ian
    Qendro, Lorena
    Spathis, Dimitris
    Kawsar, Fahim
    Mascolo, Cecilia
    Mathur, Akhil
    2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, 2024, : 2829 - 2838
  • [8] Self-supervised Fine-tuning for Efficient Passage Re-ranking
    Kim, Meoungjun
    Ko, Youngjoong
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 3142 - 3146
  • [9] Self-Supervised Learning With Data-Efficient Supervised Fine-Tuning for Crowd Counting
    Wang, Rui
    Hao, Yixue
    Hu, Long
    Chen, Jincai
    Chen, Min
    Wu, Di
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1538 - 1546
  • [10] Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning
    Zhang, Yifan
    Hooi, Bryan
    Hu, Dapeng
    Liang, Jian
    Feng, Jiashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,