Zero-shot test time adaptation via knowledge distillation for personalized speech denoising and dereverberation

Cited by: 2
Authors
Kim, Sunwoo [1]
Athi, Mrudula [1]
Shi, Guangji [1]
Kim, Minje [1,2]
Kristjansson, Trausti [1]
Affiliations
[1] Amazon Lab126, Sunnyvale, CA 94089 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Source
Funding
U.S. National Science Foundation;
Keywords
DOMAIN ADAPTATION; ENHANCEMENT; NOISE;
DOI
10.1121/10.0024621
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
A personalization framework to adapt compact models to test time environments and improve their speech enhancement (SE) performance in noisy and reverberant conditions is proposed. The use-cases are when the end-user device encounters only one or a few speakers and noise types that tend to reoccur in the specific acoustic environment. Hence, a small personalized model that is sufficient to handle this focused subset of the original universal SE problem is postulated. The study addresses a major data shortage issue: although the goal is to learn from a specific user's speech signals and the test time environment, the target clean speech is unavailable for model training due to privacy-related concerns and technical difficulty of recording noise and reverberation-free voice signals. The proposed zero-shot personalization method uses no clean speech target. Instead, it employs the knowledge distillation framework, where the more advanced denoising results from an overly large teacher work as pseudo targets to train a small student model. Evaluation on various test time conditions suggests that the proposed personalization approach can significantly enhance the compact student model's test time performance. Personalized models outperform larger non-personalized baseline models, demonstrating that personalization achieves model compression with no loss in dereverberation and denoising performance.
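The distillation idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "teacher" is a 5-tap moving average standing in for a large pretrained denoiser, the "student" is a 3-tap FIR filter, the loss is mean squared error, and the signals are synthetic sinusoids with Gaussian noise. The only point it shows is that the student is fitted to the teacher's outputs on noisy test-time signals, so no clean speech target is needed.

```python
import math
import random

def teacher_denoise(noisy):
    """Stand-in for a large teacher model (assumption): 5-tap moving average."""
    n = len(noisy)
    out = []
    for t in range(n):
        window = noisy[max(0, t - 2): min(n, t + 3)]
        out.append(sum(window) / len(window))
    return out

def student_denoise(noisy, w):
    """Compact student (assumption): 3-tap FIR filter with weights w, zero-padded edges."""
    padded = [0.0] + list(noisy) + [0.0]
    return [sum(w[k] * padded[t + k] for k in range(3)) for t in range(len(noisy))]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distill(noisy_signals, steps=200, lr=0.05):
    """Fit the student to the teacher's pseudo targets; clean speech is never used."""
    w = [0.0, 1.0, 0.0]  # start from the identity filter
    pseudo_targets = [teacher_denoise(x) for x in noisy_signals]
    for _ in range(steps):
        for x, target in zip(noisy_signals, pseudo_targets):
            padded = [0.0] + list(x) + [0.0]
            pred = student_denoise(x, w)
            n = len(x)
            grad = [0.0, 0.0, 0.0]
            for t in range(n):
                err = pred[t] - target[t]
                for k in range(3):
                    # Gradient of the MSE with respect to tap k
                    grad[k] += 2.0 * err * padded[t + k] / n
            w = [w[k] - lr * grad[k] for k in range(3)]
    return w

def avg_distill_loss(signals, w):
    """Average MSE between student output and teacher pseudo targets."""
    return sum(mse(student_denoise(x, w), teacher_denoise(x)) for x in signals) / len(signals)

# Synthetic "test-time" recordings: noisy sinusoids (hypothetical data).
rng = random.Random(0)
noisy_signals = [
    [math.sin(0.2 * t + phase) + rng.gauss(0.0, 0.3) for t in range(64)]
    for phase in (0.0, 1.0, 2.0, 3.0)
]

before = avg_distill_loss(noisy_signals, [0.0, 1.0, 0.0])
w = distill(noisy_signals)
after = avg_distill_loss(noisy_signals, w)
print("distillation loss before:", before, "after:", after)
```

In a realistic setting the teacher and student would both be neural networks and the noisy signals would be recordings from the user's device; the training loop keeps the same shape, with the teacher's output replacing the unavailable clean target.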
Pages: 1353-1367
Page count: 15
Related papers
50 in total
  • [21] Zero-shot Knowledge Transfer via Adversarial Belief Matching
    Micaelli, Paul
    Storkey, Amos
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [22] Generalized Zero-shot Intent Detection via Commonsense Knowledge
    Siddique, A. B.
    Jamour, Fuad
    Xu, Luxun
    Hristidis, Vagelis
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 1925 - 1929
  • [23] Adversarial Distillation Adaptation Model with Sentiment Contrastive Learning for Zero-Shot Stance Detection
    Zhang, Yu
    Wang, Chunling
    Wang, Jia
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2023, 16 (01)
  • [24] Adversarial Distillation Adaptation Model with Sentiment Contrastive Learning for Zero-Shot Stance Detection
    Yu Zhang
    Chunling Wang
    Jia Wang
    International Journal of Computational Intelligence Systems, 16
  • [25] ZERO-SHOT PERSONALIZED SPEECH ENHANCEMENT THROUGH SPEAKER-INFORMED MODEL SELECTION
    Sivaraman, Aswin
    Kim, Minje
    2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2021, : 171 - 175
  • [26] Test-Time Zero-Shot Temporal Action Localization
    Liberatori, Benedetta
    Conti, Alessandro
    Rota, Paolo
    Wang, Yiming
    Ricci, Elisa
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 18720 - 18729
  • [27] Relationship-Preserving Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval
    Tian, Jialin
    Xu, Xing
    Wang, Zheng
    Shen, Fumin
    Liu, Xin
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5473 - 5481
  • [28] Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero-shot Medical Image Segmentation
    Aleem, Sidra
    Wang, Fangyijie
    Maniparambil, Mayug
    Arazo, Eric
    Dietlmeier, Julia
    Curran, Kathleen
    O'Connor, Noel E.
    Little, Suzanne
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW, 2024, : 5184 - 5193
  • [29] Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels
    Wan, Bo
    Tuytelaars, Tinne
    2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, 2024, : 1794 - 1804
  • [30] Zero-Shot Learning via Contrastive Learning on Dual Knowledge Graphs
    Wang, Jin
    Jiang, Bo
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 885 - 892