Zero-shot test time adaptation via knowledge distillation for personalized speech denoising and dereverberation

Cited by: 2
Authors
Kim, Sunwoo [1]
Athi, Mrudula [1]
Shi, Guangji [1]
Kim, Minje [1, 2]
Kristjansson, Trausti [1]
Affiliations
[1] Amazon Lab126, Sunnyvale, CA 94089 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Source
Funding
U.S. National Science Foundation
Keywords
DOMAIN ADAPTATION; ENHANCEMENT; NOISE;
DOI
10.1121/10.0024621
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
A personalization framework to adapt compact models to test time environments and improve their speech enhancement (SE) performance in noisy and reverberant conditions is proposed. The use-cases are when the end-user device encounters only one or a few speakers and noise types that tend to reoccur in the specific acoustic environment. Hence, a small personalized model that is sufficient to handle this focused subset of the original universal SE problem is postulated. The study addresses a major data shortage issue: although the goal is to learn from a specific user's speech signals and the test time environment, the target clean speech is unavailable for model training due to privacy-related concerns and technical difficulty of recording noise and reverberation-free voice signals. The proposed zero-shot personalization method uses no clean speech target. Instead, it employs the knowledge distillation framework, where the more advanced denoising results from an overly large teacher work as pseudo targets to train a small student model. Evaluation on various test time conditions suggests that the proposed personalization approach can significantly enhance the compact student model's test time performance. Personalized models outperform larger non-personalized baseline models, demonstrating that personalization achieves model compression with no loss in dereverberation and denoising performance.
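The distillation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "teacher" here is a 9-tap moving-average filter standing in for a large enhancement model, the "student" is a 3-tap FIR filter, and every name (`fir`, `mse`, `distill`) and parameter value is a hypothetical choice made for illustration. The one point it shares with the described method is that the student is fitted only to the teacher's output on noisy test-time audio, and no clean target is used.

```python
import math
import random


def fir(signal, taps):
    """Apply a centered FIR filter, treating samples outside the signal as zero."""
    half = len(taps) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(taps):
            idx = n + k - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distill(noisy, teacher_taps, n_student_taps=3, lr=0.05, steps=200):
    """Fit a small student filter to the teacher's output on noisy input only."""
    # Pseudo target: the teacher's estimate. No clean signal is involved.
    pseudo_target = fir(noisy, teacher_taps)
    half = n_student_taps // 2
    student = [0.0] * n_student_taps
    student[half] = 1.0  # start from the identity filter
    history = []
    for _ in range(steps):
        pred = fir(noisy, student)
        history.append(mse(pred, pseudo_target))
        # Gradient of the MSE with respect to each student tap.
        grad = [0.0] * n_student_taps
        for n in range(len(noisy)):
            err = pred[n] - pseudo_target[n]
            for k in range(n_student_taps):
                idx = n + k - half
                if 0 <= idx < len(noisy):
                    grad[k] += 2.0 * err * noisy[idx] / len(noisy)
        student = [w - lr * g for w, g in zip(student, grad)]
    return student, history


# Toy "test-time" recording: a slow sine wave plus Gaussian noise (assumed data).
random.seed(0)
noisy = [math.sin(2 * math.pi * n / 50) + random.gauss(0.0, 0.3) for n in range(200)]
teacher_taps = [1.0 / 9.0] * 9  # stand-in for a large teacher model

student, history = distill(noisy, teacher_taps)
print("distillation loss before:", history[0])
print("distillation loss after: ", history[-1])
```

In a realistic setting both models would be neural networks and the loss would be computed on batches of recorded test-time audio; the sketch only shows the training signal flowing from teacher output to student parameters.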
Pages: 1353-1367
Number of pages: 15
Related papers
50 in total
  • [31] Enhancing Zero-Shot Stance Detection via Targeted Background Knowledge
    Zhu, Qinglin
    Liang, Bin
    Sun, Jingyi
    Du, Jiachen
    Zhou, Lanjun
    Xu, Ruifeng
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 2070 - 2075
  • [32] Optimizing feature fusion for improved zero-shot adaptation in text-to-speech synthesis
    Chen, Zhiyong
    Ai, Zhiqi
    Ma, Youxuan
    Li, Xinnuo
    Xu, Shugong
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01):
  • [33] Diverse Policy Learning via Random Obstacle Deployment for Zero-Shot Adaptation
    Choi, Seokjin
    Lee, Yonghyeon
    Kim, Seungyeon
    Park, Che-Sang
    Hwang, Himchan
    Park, Frank C.
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (03): : 2510 - 2517
  • [34] Generalized zero-shot domain adaptation via coupled conditional variational autoencoders
    Wang, Qian
    Breckon, Toby P.
    NEURAL NETWORKS, 2023, 163 : 40 - 52
  • [35] Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model
    Wang, Zi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7688 - 7699
  • [36] Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval
    Wang, Kai
    Wang, Yifan
    Xu, Xing
    Liu, Xin
    Ou, Weihua
    Lu, Huimin
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [37] Zero-Shot Knowledge Distillation Using Label-Free Adversarial Perturbation With Taylor Approximation
    Lee, Kang Il
    Lee, Seunghyun
    Song, Byung Cheol
    IEEE ACCESS, 2021, 9 : 45454 - 45461
  • [38] Zero-Shot Modulation Recognition via Knowledge-Informed Waveform Description
    Chen, Ying
    Wang, Xiang
    Huang, Zhitao
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 21 - 25
  • [39] High fidelity zero shot speaker adaptation in text to speech synthesis with denoising diffusion GAN
    Liu, Xiangchun
    Ma, Xuan
    Song, Wei
    Zhang, Yanghao
    Zhang, Yi
    SCIENTIFIC REPORTS, 2025, 15 (01):
  • [40] Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
    Guo, Jiayi
    Wang, Chaofei
    Wu, You
    Zhang, Eric
    Wang, Kai
    Xu, Xingqian
    Song, Shiji
    Shi, Humphrey
    Huang, Gao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 11494 - 11503