Accurate 3D action recognition using learning on the Grassmann manifold

Cited by: 149
Authors
Slama, Rim [1 ,2 ]
Wannous, Hazem [1 ,2 ]
Daoudi, Mohamed [2 ,3 ]
Srivastava, Anuj [4 ]
Affiliations
[1] Univ Lille 1, F-59655 Villeneuve Dascq, France
[2] CNRS, UMR 8022, LIFL Lab, Villeneuve Dascq, France
[3] Inst Mines Telecom Telecom Lille, Villeneuve Dascq, France
[4] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
Funding
National Science Foundation (USA);
Keywords
Human action recognition; Grassmann manifold; Observational latency; Depth images; Skeleton; Classification; SPARSE REPRESENTATION; VIDEO; ALGORITHMS;
DOI
10.1016/j.patcog.2014.08.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we address the problem of modeling and analyzing human motion by focusing on 3D body skeletons. In particular, our intent is to represent skeletal motion in a geometric and efficient way, leading to an accurate action-recognition system. Here an action is represented by a dynamical system whose observability matrix is characterized as an element of a Grassmann manifold. To formulate our learning algorithm, we propose two distinct ideas: (1) in the first, we perform classification using a Truncated Wrapped Gaussian model, one for each class in its own tangent space; (2) in the second, we propose a novel learning algorithm that uses a vector representation formed by concatenating local coordinates in tangent spaces associated with different classes and trains a linear SVM on it. We evaluate our approaches on three public 3D action datasets: MSR-action 3D, UT-kinect and UCF-kinect; these datasets represent different kinds of challenges and together help provide an exhaustive evaluation. The results show that our approaches either match or exceed state-of-the-art performance, reaching 91.21% on MSR-action 3D, 97.91% on UCF-kinect, and 88.5% on UT-kinect. Finally, we evaluate the latency of our approach, i.e. its ability to recognize an action before its termination, and demonstrate improvements relative to other published approaches. (C) 2014 Elsevier Ltd. All rights reserved.
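The geometric pipeline named in the abstract (dynamical system, observability matrix, Grassmann manifold, tangent space) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a common SVD-based estimate of the linear dynamical system, which the abstract does not specify, and the function names, the state dimension d, and the truncation length m are all hypothetical choices made for illustration.

```python
import numpy as np

def observability_subspace(Y, d=3, m=4):
    """Map a sequence Y (features x frames) to an orthonormal basis of the
    column space of a truncated observability matrix [C; CA; ...; CA^(m-1)].
    The SVD-based estimate of (A, C) is an assumption of this sketch."""
    Y = Y - Y.mean(axis=1, keepdims=True)        # remove the temporal mean
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :d]                                 # measurement matrix
    X = S[:d, None] * Vt[:d]                     # hidden states, shape (d, frames)
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])     # least-squares transition matrix
    blocks = [C]
    for _ in range(m - 1):
        blocks.append(blocks[-1] @ A)            # C, CA, CA^2, ...
    O = np.vstack(blocks)                        # shape (m * features, d)
    Q, _ = np.linalg.qr(O)                       # orthonormal basis of span(O)
    return Q

def grassmann_log(X, Y):
    """Tangent vector at span(X) pointing toward span(Y).
    Requires X.T @ Y to be invertible."""
    XtY = X.T @ Y
    M = (Y - X @ XtY) @ np.linalg.inv(XtY)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.arctan(s)) @ Vt

def geodesic_distance(X, Y):
    """Geodesic distance computed from the principal angles."""
    s = np.linalg.svd(X.T @ Y, compute_uv=False)
    return np.linalg.norm(np.arccos(np.clip(s, -1.0, 1.0)))
```

Under these assumptions, each action sequence becomes a point on the manifold, and grassmann_log gives its local coordinates in the tangent space at a chosen reference point, for example a class mean. Flattening and concatenating such coordinates over several reference points would give a fixed-length vector of the kind a linear SVM can be trained on.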
Pages: 556 - 567
Number of pages: 12
Related papers
50 items in total
  • [21] Automatic Recognition of Space-Time Constellations by Learning on the Grassmann Manifold
    Du, Yuqing
    Zhu, Guangxu
    Zhang, Jiayao
    Huang, Kaibin
    2018 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2018,
  • [22] 3D CARICATURE GENERATION BY MANIFOLD LEARNING
    Li, Pengfei
    Chen, Yiqiang
    Liu, Junfa
    Fu, Guanhua
    2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 941 - 944
  • [23] Learning Match Kernels on Grassmann Manifolds for Action Recognition
    Zhang, Lei
    Zhen, Xiantong
    Shao, Ling
    Song, Jingkuan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (01) : 205 - 215
  • [24] 3D facial expression recognition using kernel methods on Riemannian manifold
    Hariri, Walid
    Tabia, Hedi
    Farah, Nadir
    Benouareth, Abdallah
    Declercq, David
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2017, 64 : 25 - 32
  • [25] Manifold Matrices-based Attention Mechanisms on 3D Skeletons for Human Action Recognition
    Li, Guang
    Ding, Chongyang
    Li, Jianjun
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2025,
  • [26] Deep Learning-Based Action Recognition Using 3D Skeleton Joints Information
    Tasnim, Nusrat
    Islam, Md. Mahbubul
    Baek, Joong-Hwan
    INVENTIONS, 2020, 5 (03) : 1 - 15
  • [27] A hybrid deep learning architecture using 3D CNNs and GRUs for human action recognition
    Savadi Hosseini, M.
    Ghaderi, F.
    INTERNATIONAL JOURNAL OF ENGINEERING, TRANSACTIONS B: APPLICATIONS, 2020, 33 (05) : 959 - 965
  • [28] Learning 3D Skeletal Representation From Transformer for Action Recognition
    Cha, Junuk
    Saqlain, Muhammad
    Kim, Donguk
    Lee, Seungeun
    Lee, Seongyeong
    Baek, Seungryul
    IEEE ACCESS, 2022, 10 : 67541 - 67550
  • [29] Spatiotemporal Multimodal Learning With 3D CNNs for Video Action Recognition
    Wu, Hanbo
    Ma, Xin
    Li, Yibin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (03) : 1250 - 1261
  • [30] Human Action Recognition Using 3D Zernike Moments
    Arik, Okay
    Bingol, A. Semih
    2014 11TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2014,