Focal Channel Knowledge Distillation for Multi-Modality Action Recognition

Cited by: 1
Authors
Gan, Lipeng [1]
Cao, Runze [1]
Li, Ning [1]
Yang, Man [1]
Li, Xiaochao [1, 2, 3]
Affiliations
[1] Xiamen Univ, Dept Microelect & Integrated Circuit, Xiamen 361005, Peoples R China
[2] Xiamen Univ Malaysia, Dept Elect & Elect Engn, Sepang 43900, Selangor, Malaysia
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Action recognition; knowledge distillation; multi-modality;
DOI
10.1109/ACCESS.2023.3298647
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Multi-modality action recognition aims to learn complementary information from multiple modalities to improve action recognition performance. However, channel semantics differ significantly across modalities, so transferring channel semantic features from every modality to RGB with equal weight results in competition and redundancy during knowledge distillation. To address this issue, we propose a focal channel knowledge distillation strategy that transfers the key semantic correlations and distributions of multi-modality teachers into the RGB student network. The focal channel correlations provide the intrinsic relationships and diversity properties of key semantics, and the focal channel distributions provide the salient channel activations of features. By ignoring the less discriminative and irrelevant channels, the student can use its channel capacity more efficiently to learn complementary semantic features from the other modalities. Our focal channel knowledge distillation achieves 91.2%, 95.6%, 98.3%, and 81.0% accuracy on the NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51 datasets, improvements of 4.5%, 4.2%, 3.7%, and 7.1%, respectively, over unimodal RGB models. The framework can also be integrated with unimodal models to achieve state-of-the-art performance: extensive experiments show that the proposed method then achieves 92.5%, 96.0%, 98.9%, and 82.3% accuracy on the NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51 datasets, respectively.
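The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of distilling channel distributions and channel correlations on a selected subset of channels. It is not the authors' implementation. Every name here (`select_focal_channels`, `focal_channel_kd_loss`, `k`, `tau`, `alpha`, `beta`) is hypothetical, and the channel-selection rule (top-k mean absolute teacher activation) is an assumption made for illustration. Feature maps are assumed to be flattened to shape (channels, positions).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def select_focal_channels(teacher, k):
    # ASSUMPTION: rank channels by mean absolute teacher activation
    # and keep the k highest; the paper's criterion may differ.
    saliency = np.abs(teacher).mean(axis=1)
    return np.sort(np.argsort(saliency)[-k:])

def channel_distribution_loss(teacher, student, idx, tau=1.0):
    # Each channel becomes a distribution over positions; the loss is
    # the mean KL(teacher || student) over the selected channels.
    p = softmax(teacher[idx] / tau, axis=1)
    q = softmax(student[idx] / tau, axis=1)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=1)
    return float(kl.mean())

def channel_correlation(feat):
    # Cosine-similarity (normalized Gram) matrix between channels.
    norm = np.linalg.norm(feat, axis=1, keepdims=True)
    unit = feat / (norm + 1e-12)
    return unit @ unit.T

def channel_correlation_loss(teacher, student, idx):
    # Mean squared difference between the two correlation matrices,
    # restricted to the selected channels.
    diff = channel_correlation(teacher[idx]) - channel_correlation(student[idx])
    return float((diff ** 2).mean())

def focal_channel_kd_loss(teacher, student, k, tau=1.0, alpha=1.0, beta=1.0):
    # Weighted sum of both terms; alpha and beta are hypothetical weights.
    idx = select_focal_channels(teacher, k)
    dist = channel_distribution_loss(teacher, student, idx, tau)
    corr = channel_correlation_loss(teacher, student, idx)
    return alpha * dist + beta * corr
```

In this sketch the loss is zero when the student features equal the teacher features on the selected channels, and channels outside the selection contribute nothing, which mirrors the abstract's point about ignoring less discriminative channels.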
Pages: 78285-78298
Page count: 14