Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

Cited by: 214
Authors
Sun, Shuyang [1, 2]
Kuang, Zhanghui [2]
Sheng, Lu [3]
Ouyang, Wanli [1]
Zhang, Wei [2]
Affiliations
[1] Univ Sydney, Sydney, NSW, Australia
[2] SenseTime Res, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
DOI
10.1109/CVPR.2018.00151
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Motion representation plays a vital role in human action recognition in videos. In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach. The OFF is derived from the definition of optical flow and is orthogonal to the optical flow. The derivation also provides theoretical support for using the difference between two frames. By directly calculating pixel-wise spatio-temporal gradients of the deep feature maps, the OFF could be embedded in any existing CNN based video action recognition framework with only a slight additional cost. It enables the CNN to extract spatiotemporal information, especially the temporal information between frames simultaneously. This simple but powerful idea is validated by experimental results. The network with OFF fed only by RGB inputs achieves a competitive accuracy of 93.3% on UCF-101, which is comparable with the result obtained by two streams (RGB and optical flow), but is 15 times faster in speed. Experimental results also show that OFF is complementary to other motion modalities such as optical flow. When the proposed method is plugged into the state-of-the-art video action recognition framework, it has 96.0% and 74.2% accuracy on UCF-101 and HMDB-51 respectively. The code for this project is available at: https://github.com/kevin-ssy/Optical-Flow-Guided-Feature
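The abstract's central idea, pixel-wise spatio-temporal gradients of feature maps, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the linked repository: the function names `filter3x3` and `off_like_features`, the choice of a Sobel filter scaled by 1/8 for the spatial gradients, a plain frame difference for the temporal gradient, and the toy single-channel "feature maps". The toy data is a horizontal ramp shifted right by one pixel, so the assumed flow is (1, 0) and the gradient vector (gx, gy, gt) should be orthogonal to (vx, vy, 1) at interior pixels.

```python
# Illustrative sketch only: names, kernels and toy data are assumptions,
# not the authors' implementation.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(fmap, kernel, scale=1.0):
    """Cross-correlate a 2-D map with a 3x3 kernel, zero-padding the border."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy, xx = y + ky - 1, x + kx - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[ky][kx] * fmap[yy][xx]
            out[y][x] = acc * scale
    return out


def off_like_features(feat_t, feat_t1):
    """Return (d/dx, d/dy, d/dt) maps for one channel of two consecutive frames.

    Spatial gradients use a Sobel filter scaled by 1/8 so that a unit-slope
    ramp yields a gradient of 1; the temporal gradient is the element-wise
    difference between the two frames.
    """
    gx = filter3x3(feat_t, SOBEL_X, scale=1.0 / 8.0)
    gy = filter3x3(feat_t, SOBEL_Y, scale=1.0 / 8.0)
    gt = [[b - a for a, b in zip(row_a, row_b)]
          for row_a, row_b in zip(feat_t, feat_t1)]
    return gx, gy, gt


if __name__ == "__main__":
    # Toy maps: a horizontal ramp f(x, y) = x, then the same ramp moved
    # one pixel to the right, i.e. an assumed flow of (vx, vy) = (1, 0).
    frame_t = [[float(x) for x in range(5)] for _ in range(5)]
    frame_t1 = [[float(x - 1) for x in range(5)] for _ in range(5)]
    gx, gy, gt = off_like_features(frame_t, frame_t1)
    vx, vy = 1.0, 0.0
    # Dot product of (gx, gy, gt) with (vx, vy, 1) at an interior pixel.
    residual = gx[2][2] * vx + gy[2][2] * vy + gt[2][2]
    print("constraint residual at (2, 2):", residual)
```

In an actual network these operations would be applied per channel to intermediate CNN feature maps rather than to hand-built ramps; border pixels are affected by the zero padding and do not satisfy the constraint in this toy setup.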
Pages: 1390 - 1399
Number of pages: 10
Related papers
50 in total
  • [1] MV2Flow: Learning Motion Representation for Fast Compressed Video Action Recognition
    Hu, Hezhen
    Zhou, Wengang
    Li, Xingze
    Yan, Ning
    Li, Houqiang
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 16 (03)
  • [2] A Robust and Efficient Video Representation for Action Recognition
    Heng Wang
    Dan Oneata
    Jakob Verbeek
    Cordelia Schmid
    International Journal of Computer Vision, 2016, 119 : 219 - 238
  • [3] A Robust and Efficient Video Representation for Action Recognition
    Wang, Heng
    Oneata, Dan
    Verbeek, Jakob
    Schmid, Cordelia
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 119 (03) : 219 - 238
  • [4] Motion Feature Combination for Human Action Recognition in Video
    Meng, Hongying
    Pears, Nick
    Bailey, Chris
    COMPUTER VISION AND COMPUTER GRAPHICS, 2008, 21 : 151 - +
  • [5] Unsupervised feature extraction for the representation and recognition of lip motion video
    Lee, Michelle Jeungeun
    Lee, Kyungsuk David
    Lee, Soo-Young
    COMPUTATIONAL INTELLIGENCE AND BIOINFORMATICS, PT 3, PROCEEDINGS, 2006, 4115 : 741 - 746
  • [6] Space-Time Robust Video Representation for Action Recognition
    Ballas, Nicolas
    Yang, Yi
    Lan, Zhen-zhong
    Delezoide, Bertrand
    Preteux, Francoise
    Hauptmann, Alex
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013 : 2704 - 2711
  • [7] Motion Guided Feature-Augmented Network for Action Recognition
    Zheng, Zhenxing
    An, Gaoyun
    Ruan, Qiuqi
    PROCEEDINGS OF 2020 IEEE 15TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2020), 2020 : 391 - 394
  • [8] Motion Flow Feature Algorithm for Action Recognition in Videos
    Ye, Run
    Yan, Bin
    Hou, Shiyou
    Jing, Xiaokang
    2020 13TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2020), 2020 : 188 - 193
  • [9] AN OPTICAL FLOW FEATURE-BASED ROBUST FACIAL EXPRESSION RECOGNITION WITH HMM FROM VIDEO
    Uddin, Md. Zia
    Kim, Tae-Seong
    Song, Byung Cheol
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2013, 9 (04) : 1409 - 1421
  • [10] Slow feature subspace: A video representation based on slow feature analysis for action recognition
    Beleza, Suzana Rita Alves
    Shimomoto, Erica K.
    Souza, Lincon S.
    Fukui, Kazuhiro
    MACHINE LEARNING WITH APPLICATIONS, 2023, 14