Multimodal Multipart Learning for Action Recognition in Depth Videos

Cited by: 76
Authors
Shahroudy, Amir [1, 2]
Ng, Tian-Tsong [2 ]
Yang, Qingxiong [3 ]
Wang, Gang [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Inst Infocomm Res, 1 Fusionopolis Way, Singapore 138632, Singapore
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Action recognition; kinect; joint sparse regression; mixed norms; structured sparsity; group feature selection; MULTITASK; FEATURES; SELECTION; TRACKING; SPARSITY; MODEL;
DOI
10.1109/TPAMI.2015.2505295
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The articulated and complex nature of human actions makes the task of action recognition difficult. One approach to handling this complexity is to divide it into the kinetics of body parts and analyze the actions based on these partial descriptors. We propose a joint sparse regression-based learning method that uses structured sparsity to model each action as a combination of multimodal features from a sparse set of body parts. To represent the dynamics and appearance of parts, we employ a heterogeneous set of depth- and skeleton-based features. The proper structure of multimodal multipart features is formulated into the learning framework via the proposed hierarchical mixed norm, which regularizes the structured features of each part and applies sparsity between them, in favor of group feature selection. Our experimental results demonstrate the effectiveness of the proposed learning method: it outperforms other methods on all three tested datasets and saturates one of them by achieving perfect accuracy.
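The idea of applying sparsity between body parts can be illustrated with a generic group-sparse penalty. The sketch below is an assumption for illustration only: it computes a plain L2-within-group, L1-across-groups mixed norm, not the paper's hierarchical mixed norm, and the part names, feature indices, and weights are hypothetical.

```python
import math


def group_mixed_norm(weights, groups):
    """Sum over groups of the Euclidean norm of each group's weights.

    L2 inside a group keeps that part's features together; summing the
    group norms (L1 across groups) favors solutions that use few parts.
    """
    total = 0.0
    for indices in groups.values():
        total += math.sqrt(sum(weights[i] ** 2 for i in indices))
    return total


def active_groups(weights, groups, tol=1e-12):
    """Return the names of groups with at least one non-zero weight."""
    return [
        name
        for name, indices in groups.items()
        if any(abs(weights[i]) > tol for i in indices)
    ]


# Hypothetical layout: each body part owns a slice of the feature vector.
groups = {"left_arm": [0, 1], "right_arm": [2, 3], "torso": [4]}

# Same overall energy, concentrated in one part versus spread over two parts.
concentrated = [3.0, 4.0, 0.0, 0.0, 0.0]
spread = [3.0, 0.0, 4.0, 0.0, 0.0]

print(group_mixed_norm(concentrated, groups))  # 5.0
print(group_mixed_norm(spread, groups))        # 7.0
print(active_groups(concentrated, groups))     # ['left_arm']
```

Both weight vectors have the same ordinary Euclidean norm, but the penalty is smaller when the non-zero weights fall within a single part, which is the behavior a group feature selection regularizer relies on.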
Pages: 2123-2129
Number of pages: 7