A Hierarchical Spatio-Temporal Model for Human Activity Recognition

Cited by: 33
Authors
Xu, Wanru [1]
Miao, Zhenjiang [1]
Zhang, Xiao-Ping [2]
Tian, Yi [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing 100044, Peoples R China
[2] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada
Keywords
Activity recognition; hidden conditional random field (HCRF); hierarchical structure; spatio-temporal dependencies; hidden Markov model; framework
DOI
10.1109/TMM.2017.2674622
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
There are two key issues in human activity recognition: spatial dependencies and temporal dependencies. Most recent methods focus on only one of them and thus lack the descriptive power to recognize complex activities. In this paper, we propose a hierarchical spatio-temporal model (HSTM) that addresses this problem by modeling spatial and temporal constraints simultaneously. The HSTM is a two-layer hidden conditional random field (HCRF): the bottom-layer HCRF describes spatial relations in each frame and learns more discriminative representations, and the top-layer HCRF uses these high-level features to characterize temporal relations across the whole video sequence. The HSTM uses the bottom layer as building blocks for the top layer and aggregates evidence from the local to the global level. A novel learning algorithm is derived to train all model parameters efficiently, and its effectiveness is validated theoretically. Experimental results show that the HSTM classifies single-person actions (UCF) with higher accuracy than existing methods. More importantly, the HSTM also achieves superior performance on more practical interactions, including human-human interactions (UT-Interaction, BIT-Interaction, and CASIA) and human-object interactions (Gupta video dataset).
Pages: 1494-1509
Number of pages: 16