Focal Channel Knowledge Distillation for Multi-Modality Action Recognition

Cited by: 1
Authors
Gan, Lipeng [1]
Cao, Runze [1]
Li, Ning [1]
Yang, Man [1]
Li, Xiaochao [1,2,3]
Affiliations
[1] Xiamen Univ, Dept Microelect & Integrated Circuit, Xiamen 361005, Peoples R China
[2] Xiamen Univ Malaysia, Dept Elect & Elect Engn, Sepang 43900, Selangor, Malaysia
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
Source
IEEE ACCESS, 2023, Vol. 11
Keywords
Action recognition; knowledge distillation; multi-modality;
DOI
10.1109/ACCESS.2023.3298647
Chinese Library Classification (CLC)
TP [Automation and Computer Technology]
Discipline Classification Code
0812
Abstract
Multi-modality action recognition aims to learn complementary information from multiple modalities to improve recognition performance. However, there is a significant channel difference between modalities: transferring channel semantic features equally from the multi-modality teachers to the RGB student causes competition and redundancy during knowledge distillation. To address this issue, we propose a focal channel knowledge distillation strategy that transfers the key semantic correlations and distributions of multi-modality teachers to the RGB student network. The focal channel correlations capture the intrinsic relationships and diversity of the key semantics, and the focal channel distributions capture the salient channel activations of the features. By ignoring the less-discriminative and irrelevant channels, the student can use its channel capacity more efficiently to learn complementary semantic features from the other modalities. Our focal channel knowledge distillation achieves 91.2%, 95.6%, 98.3%, and 81.0% accuracy on the NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51 datasets, improvements of 4.5%, 4.2%, 3.7%, and 7.1% over unimodal RGB models. The framework can also be integrated with unimodal models to achieve state-of-the-art performance. Extensive experiments show that the proposed method achieves 92.5%, 96.0%, 98.9%, and 82.3% accuracy on NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51, respectively.
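To make the abstract's idea concrete, the sketch below illustrates a channel-wise "focal" distillation loss in PyTorch: only the most salient teacher channels contribute, and both the inter-channel correlations and the per-channel spatial distributions of those channels are matched between teacher and student. This is a minimal illustration assumed from the abstract, not the authors' released implementation; the function name `focal_channel_kd_loss`, the salience criterion, and the `focal_ratio` and `tau` parameters are illustrative assumptions.

```python
# Minimal sketch of focal channel knowledge distillation (assumption, not the paper's code).
import torch
import torch.nn.functional as F


def focal_channel_kd_loss(student_feat, teacher_feat, focal_ratio=0.5, tau=4.0):
    """student_feat, teacher_feat: (B, C, H, W) feature maps of matching shape."""
    b, c, h, w = teacher_feat.shape

    # Channel salience from the teacher: mean absolute activation per channel (assumed criterion).
    salience = teacher_feat.abs().mean(dim=(0, 2, 3))            # (C,)
    k = max(1, int(focal_ratio * c))
    focal_idx = salience.topk(k).indices                          # keep only the focal channels

    t = teacher_feat[:, focal_idx].reshape(b, k, h * w)           # (B, k, HW)
    s = student_feat[:, focal_idx].reshape(b, k, h * w)

    # 1) Focal channel distributions: per-channel spatial distributions (softmax over HW),
    #    matched with a temperature-scaled KL divergence.
    t_dist = F.softmax(t / tau, dim=-1)
    s_logdist = F.log_softmax(s / tau, dim=-1)
    dist_loss = F.kl_div(s_logdist, t_dist, reduction="batchmean") * (tau ** 2)

    # 2) Focal channel correlations: k x k Gram matrix of L2-normalized channel vectors,
    #    matched with an MSE loss.
    t_corr = torch.bmm(F.normalize(t, dim=-1), F.normalize(t, dim=-1).transpose(1, 2))
    s_corr = torch.bmm(F.normalize(s, dim=-1), F.normalize(s, dim=-1).transpose(1, 2))
    corr_loss = F.mse_loss(s_corr, t_corr)

    return dist_loss + corr_loss


# Usage: combined with the usual cross-entropy on the RGB student's predictions.
if __name__ == "__main__":
    student_f = torch.randn(2, 256, 7, 7)   # RGB student feature map
    teacher_f = torch.randn(2, 256, 7, 7)   # multi-modality teacher feature map
    print(focal_channel_kd_loss(student_f, teacher_f).item())
```

In this reading, discarding the low-salience channels is what frees the student's channel capacity for the complementary semantics transferred from the other modalities, as described in the abstract.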
Pages: 78285-78298
Number of pages: 14