Multi-View Attention Transfer for Efficient Speech Enhancement

Cited by: 3
Authors
Shin, Wooseok [1]
Park, Hyun Joon [1]
Kim, Jin Sob [1]
Lee, Byung Hoon [1]
Han, Sung Won [1]
Affiliations
[1] Korea Univ, Sch Ind & Management Engn, Seoul, South Korea
Keywords
speech enhancement; multi-view knowledge distillation; feature distillation; time domain; low complexity;
DOI
10.21437/Interspeech.2022-10251
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
Recent deep learning models have achieved high performance in speech enhancement; however, obtaining a fast, low-complexity model without significant performance degradation remains challenging. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output-distillation methods do not fit the speech enhancement task in some respects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation method, to obtain efficient speech enhancement models in the time domain. Based on a multi-view feature extraction model, MV-AT transfers the multi-view knowledge of the teacher network to the student network without additional parameters. Experimental results show that the proposed method consistently improves the performance of student models of various sizes on the Valentini and deep noise suppression (DNS) datasets. MANNER-S-8.1GF with the proposed method, a lightweight model for efficient deployment, requires 15.4× fewer parameters and 4.71× fewer floating-point operations (FLOPs) than the baseline model while achieving comparable performance.
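The abstract does not spell out MV-AT's exact formulation, but the general idea of feature-based attention transfer can be sketched: collapse teacher and student feature maps into normalized attention maps and penalize their distance, so no extra parameters are needed. The function names (`attention_map`, `attention_transfer_loss`) and the use of summed squared channel activations are illustrative assumptions in the spirit of standard attention-transfer distillation, not the paper's actual method.

```python
import numpy as np

def attention_map(feats):
    """Collapse a (channels, time) feature map into an L2-normalized
    1-D attention map by summing squared channel activations.
    (Illustrative choice; the paper's multi-view maps differ.)"""
    a = np.sum(feats ** 2, axis=0)           # (time,)
    return a / (np.linalg.norm(a) + 1e-8)    # normalize for scale invariance

def attention_transfer_loss(teacher_feats, student_feats):
    """Mean squared distance between teacher and student attention maps,
    averaged over the paired layers. Channel counts may differ between
    teacher and student, since the maps are channel-collapsed."""
    losses = [np.mean((attention_map(t) - attention_map(s)) ** 2)
              for t, s in zip(teacher_feats, student_feats)]
    return float(np.mean(losses))

# Toy usage: a 64-channel teacher distilling into a 32-channel student,
# with three paired feature layers of 100 time steps each.
rng = np.random.default_rng(0)
teacher = [rng.standard_normal((64, 100)) for _ in range(3)]
student = [rng.standard_normal((32, 100)) for _ in range(3)]
loss = attention_transfer_loss(teacher, student)
```

Because the loss compares channel-collapsed maps, the student can be much narrower than the teacher, which is what makes this style of distillation attractive for lightweight models such as MANNER-S-8.1GF.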
Pages: 1198-1202 (5 pages)