Multi-View Attention Transfer for Efficient Speech Enhancement

Cited by: 3
Authors
Shin, Wooseok [1]
Park, Hyun Joon [1]
Kim, Jin Sob [1]
Lee, Byung Hoon [1]
Han, Sung Won [1]
Affiliations
[1] Korea Univ, Sch Ind & Management Engn, Seoul, South Korea
Source
INTERSPEECH 2022
Keywords
speech enhancement; multi-view knowledge distillation; feature distillation; time domain; low complexity;
DOI
10.21437/Interspeech.2022-10251
CLC Number
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Recent deep learning models have achieved high performance in speech enhancement; however, it remains challenging to obtain a fast, low-complexity model without significant performance degradation. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output-distillation methods do not fit the speech enhancement task in several respects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation, to obtain efficient speech enhancement models in the time domain. Based on the multi-view feature extraction model, MV-AT transfers the multi-view knowledge of the teacher network to the student network without additional parameters. The experimental results show that the proposed method consistently improved the performance of student models of various sizes on the Valentini and deep noise suppression (DNS) datasets. MANNER-S-8.1GF with our proposed method, a lightweight model for efficient deployment, achieved 15.4× and 4.71× fewer parameters and floating-point operations (FLOPs), respectively, compared to the baseline model with similar performance.
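The sketch below illustrates how a feature-based, attention-transfer-style distillation loss of this kind can be wired up. The exact MV-AT formulation is not reproduced in this record, so the code follows the generic attention-transfer recipe applied per feature "view"; the names attention_map, mvat_style_loss, teacher_feats, and student_feats are illustrative placeholders, not the authors' API.

```python
# Minimal sketch (assumption: generic attention transfer applied per view,
# not the paper's exact MV-AT loss). Requires PyTorch.
import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor) -> torch.Tensor:
    # feat: (batch, channels, time) intermediate feature map.
    # Collapsing the channel axis removes any teacher/student width mismatch,
    # so no extra projection parameters are introduced.
    attn = feat.pow(2).mean(dim=1)          # channel-wise energy -> (batch, time)
    return F.normalize(attn, p=2, dim=1)    # unit L2 norm per example

def mvat_style_loss(teacher_feats, student_feats):
    # teacher_feats / student_feats: lists of feature maps, one per "view"
    # (e.g., different branches or stages); time lengths are assumed to match.
    losses = [
        F.mse_loss(attention_map(s), attention_map(t.detach()))
        for s, t in zip(student_feats, teacher_feats)
    ]
    return torch.stack(losses).mean()

# Hypothetical usage alongside the task loss:
# total_loss = enhancement_loss + lam * mvat_style_loss(teacher_feats, student_feats)
```

Collapsing channels into an attention map is one way to match teacher and student features of different widths without learned adapters, which is consistent with the "no additional parameters" claim in the abstract.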
Pages: 1198-1202
Page count: 5
Related Papers (50 total; items [21]-[30] shown)
• [21] Hu, Chunjia; Zhai, Guangtao; Gao, Zhongpai. Quality Enhancement of the Multi-View Exhibition System. TENCON 2015 - 2015 IEEE Region 10 Conference, 2015.
• [22] Wan, Zihang; Xu, Chao; Hu, Jing; Xiao, Jian; Meng, Zhaopeng; Chen, Jitai. Multi-view Stereo Network with Attention Thin Volume. PRICAI 2022: Trends in Artificial Intelligence, Pt III, 2022, 13631: 410-423.
• [23] Park, Hyun Joon; Kang, Byung Ha; Shin, Wooseok; Kim, Jin Sob; Han, Sung Won. MANNER: Multi-View Attention Network for Noise Erasure. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7842-7846.
• [24] Poggi, Matteo; Conti, Andrea; Mattoccia, Stefano. Multi-View Guided Multi-View Stereo. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022: 8391-8398.
• [25] Yao, Kaixuan; Liang, Jiye; Liang, Jianqing; Li, Ming; Cao, Feilong. Multi-view Graph Convolutional Networks with Attention Mechanism. Artificial Intelligence, 2022, 307.
• [26] Li, Min; Bai, Zongwen; Deng, Jie. Multi-view Attention Networks for Visual Question Answering. 2024 6th International Conference on Natural Language Processing (ICNLP 2024), 2024: 788-794.
• [27] Chen, Lei; Cao, Jie; Wang, Youquan; Liang, Weichao; Zhu, Guixiang. Multi-view Graph Attention Network for Travel Recommendation. Expert Systems with Applications, 2022, 191.
• [28] Lee, Jongsoo; Chae, Dong-Kyu. Multi-view Mixed Attention for Contrastive Learning on Hypergraphs. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024), 2024: 2543-2547.
• [29] Fu, You; Fang, Siyu; Wang, Rui; Yi, Xiulong; Yu, Jianzhi; Hua, Rong. Multi-view Attention with Memory Assistant for Image Captioning. 2022 IEEE 6th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), 2022: 436-440.
• [30] Sun, Dengdi; Su, Zhixiang; Ding, Zhuanlian; Luo, Bin. Action Recognition with a Multi-View Temporal Attention Network. Cognitive Computation, 2022, 14: 1082-1095.