LAE-Net: Light and Efficient Network for Compressed Video Action Recognition

Cited by: 2

Authors:

Guo, Jinxin [1]

Zhang, Jiaqiang [1]

Zhang, Xiaojing [1]

Ma, Ming [1]

Affiliation:

[1] Inner Mongolia Univ, Hohhot, Peoples R China

Source:

MULTIMEDIA MODELING, MMM 2023, PT II | 2023 / Vol. 13834

Keywords:

Action recognition; Compressed video; Transfer learning;

DOI:

10.1007/978-3-031-27818-1_22

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Action recognition is a crucial task in computer vision and video analysis. Two-stream networks and 3D ConvNets are representative approaches. Although both achieve outstanding performance, optical flow and 3D convolution require a large amount of computation, which does not meet the needs of real-time applications. Recent work extracts motion vectors and residuals directly from the compressed video to replace optical flow. However, because motion vectors are a noisy and inaccurate representation of motion, model accuracy decreases significantly when they are used as input. Moreover, existing works focus only on either improving accuracy or reducing computational cost, without exploring the tradeoff between the two. In this paper, we propose a light and efficient multi-stream framework that includes a motion temporal fusion module (MTFM) and a double compressed knowledge distillation module (DCKD). MTFM improves the network's ability to extract complete motion information and compensates, to some extent, for the inaccurate description of motion by motion vectors in compressed video. DCKD allows the student network to gain more knowledge from the teacher with fewer parameters and fewer input frames. Experiments on two public benchmarks (UCF-101 and HMDB-51) show that the proposed method outperforms the state of the art in the compressed domain.
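
The abstract does not spell out how DCKD is formulated, so the following is only a minimal sketch of generic teacher-student knowledge distillation (a temperature-softened KL term combined with a hard-label cross-entropy term), not the paper's module. The function names `softmax` and `distillation_loss`, and the parameters `temperature` and `alpha`, are assumptions introduced for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic distillation loss (hypothetical helper, not the paper's DCKD).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    soft-target KL divergence between teacher and student.
    """
    # Hard-label term: cross-entropy of the student against the ground truth.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # Soft-target term: KL(teacher || student) at a shared temperature.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)

    # Scaling by T^2 keeps the soft-term gradients comparable across temperatures.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

In this sketch the soft-target term vanishes when the student's logits match the teacher's, so a smaller student is pushed toward the teacher's output distribution in addition to the ground-truth label.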


Pages: 265-276

Number of pages: 12

