Human Action Recognition Combined With Object Detection

Cited by: 0
Authors
Zhou B. [1]
Li J.-F. [1]
Affiliation
[1] Institute of Automation, Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou
Source
Keywords
Action recognition; Computer vision; Convolutional neural network (CNN); Deep learning; Object detection
DOI
10.16383/j.aas.c180848
CLC number
Subject classification number
Abstract
Most methods for human action recognition extract features directly from the original video frames. As a result, they carry redundant background information, which adds noise to the neural network's input. To address background interference, the large amount of redundant information in video frames, class imbalance among samples, and the difficulty of classifying certain individual classes, this paper proposes a new human action recognition algorithm combined with object detection. First, an object detection mechanism is added to the recognition pipeline so that the neural network focuses on learning the motion information of the human body. Second, the video is divided into segments and randomly sampled to establish long-term temporal modeling across the entire video. Finally, action recognition is performed with an improved neural network loss function. Extensive experiments are conducted on the popular human action recognition datasets UCF101 and HMDB51. Using RGB images only, the recognition accuracy is 96.0% and 75.3%, respectively, which is significantly higher than that of state-of-the-art human action recognition algorithms. Copyright © 2020 Acta Automatica Sinica. All rights reserved.
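The segment-wise random sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the segment count or the number of frames drawn per segment, so the code below assumes equal-length segments with one randomly chosen frame index per segment; the function name `sample_segment_indices` and its parameters are hypothetical and not taken from the paper.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Draw one random frame index from each of num_segments near-equal segments.

    Assumption (not from the paper): one frame per segment, segments cover
    the whole video without overlap.
    """
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    # Segment k covers the half-open index range [bounds[k], bounds[k + 1]).
    bounds = [k * num_frames // num_segments for k in range(num_segments + 1)]
    return [rng.randrange(bounds[k], bounds[k + 1]) for k in range(num_segments)]


# Example: a 90-frame video split into 3 segments of 30 frames each.
indices = sample_segment_indices(90, 3, random.Random(0))
print(indices)
```

Because every segment contributes a frame, the sampled indices span the full length of the video, which is what allows a model fed with these frames to see long-range temporal structure rather than a single short clip.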
Pages: 1961-1970
Page count: 9