Multimodal human action recognition based on spatio-temporal action representation recognition model

Cited by: 0

Authors:

Qianhan Wu

Qian Huang

Xing Li

Affiliations:

[1] Hohai University, The Key Laboratory of Water Big Data Technology of Ministry of Water Resources

[2] Hohai University, School of Computer and Information

Source:

Multimedia Tools and Applications | 2023 / Vol. 82

Keywords:

Human action recognition; Multimode learning; HP-DMI; ST-GCN extractor; HTMCCA;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Human action recognition methods based on single-modal data lack adequate information, so methods based on multimodal data, together with fusion algorithms that combine different features, are needed. Meanwhile, existing features extracted from depth videos and skeleton sequences are not sufficiently representative. In this paper, we propose a new model, the Spatio-temporal Action Representation Recognition Model, for recognizing human actions. The model introduces a new depth feature map, Hierarchical Pyramid Depth Motion Images (HP-DMI), to represent depth videos, and adopts a Spatial-temporal Graph Convolutional Networks (ST-GCN) extractor to summarize skeleton features, named Spatio-temporal Joint Descriptors (STJD). Histogram of Oriented Gradient (HOG) is applied to HP-DMI to extract HP-DMI-HOG features. The two kinds of features are then input into a fusion algorithm, High Trust Mean Canonical Correlation Analysis (HTMCCA), which mitigates the impact of noisy samples on multi-feature fusion and reduces computational complexity. Finally, a Support Vector Machine (SVM) is used for human action recognition. To evaluate the performance of our approach, several experiments are conducted on two public datasets. The experimental results demonstrate its effectiveness.
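The abstract gives no formulas for HP-DMI or HTMCCA, so the following is only a rough sketch of two simpler, standard building blocks that the described pipeline resembles: a plain depth motion image (accumulated absolute frame differences, without the hierarchical pyramid) and ordinary canonical correlation analysis used to fuse two feature sets by concatenating their projections. The function names `depth_motion_image` and `cca_fuse`, the ridge term `reg`, and the concatenation strategy are assumptions made for illustration and are not taken from the paper. HOG extraction, the ST-GCN extractor, and the SVM classifier are omitted.

```python
import numpy as np


def depth_motion_image(depth_video):
    """Plain depth motion image (NOT the paper's HP-DMI).

    depth_video: array of shape (T, H, W).
    Returns an (H, W) map of accumulated absolute frame differences.
    """
    frames = np.asarray(depth_video, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)


def cca_fuse(X, Y, n_components, reg=1e-6):
    """Standard CCA fusion (NOT the paper's HTMCCA).

    X: (n, p) and Y: (n, q) are two feature sets for the same n samples.
    reg is a small ridge term (an assumption) that keeps the covariance
    matrices positive definite.
    Returns the concatenated projections and the canonical correlations.
    """
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with its Cholesky factor: S = L L^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])

    fused = np.hstack([Xc @ Wx, Yc @ Wy])
    return fused, s[:n_components]
```

In a pipeline like the one described, `X` would hold the depth-based descriptors and `Y` the skeleton-based descriptors for the same samples, and the fused matrix would be passed to a classifier. `n_components` must not exceed the smaller of the two feature dimensions.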

Pages: 16409-16430

Page count: 21

50 items in total

[41] Action recognition using spatio-temporal regularity based features
Goodhart, Taylor
Yan, Pingkun
Shah, Mubarak
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 745 - 748
[42] Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition
Li, Chaolong
Cui, Zhen
Zheng, Wenming
Xu, Chunyan
Yang, Jian
THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3482 - 3489
[43] Integrally Cooperative Spatio-Temporal Feature Representation of Motion Joints for Action Recognition
Chao, Xin
Hou, Zhenjie
Liang, Jiuzhen
Yang, Tianjin
SENSORS, 2020, 20 (18) : 1 - 22
[44] Spatio-temporal action localization and detection for human recognition in big dataset
Megrhi, Sameh
Jmal, Marwa
Souidene, Wided
Beghdadi, Azeddine
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2016, 41 : 375 - 390
[45] Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition
Wu, Haoze
Liu, Jiawei
Zha, Zheng-Jun
Chen, Zhenzhong
Sun, Xiaoyan
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 968 - 974
[46] Evaluation of Color Spatio-Temporal Interest Points for Human Action Recognition
Everts, Ivo
van Gemert, Jan C.
Gevers, Theo
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2014, 23 (04) : 1569 - 1580
[47] Intelligent attendance monitoring system with spatio-temporal human action recognition
Ming-Fong Tsai
Min-Hao Li
Soft Computing, 2023, 27 : 5003 - 5019
[48] Human Action Recognition using Factorized Spatio-Temporal Convolutional Networks
Sun, Lin
Jia, Kui
Yeung, Dit-Yan
Shi, Bertram E.
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4597 - 4605
[49] Spatio-Temporal Human-Object Interactions for Action Recognition in Videos
Escorcia, Victor
Carlos Niebles, Juan
2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2013, : 508 - 514
[50] Human action recognition using Local Spatio-Temporal Discriminant Embedding
Jia, Kui
Yeung, Dit-Yan
2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3040 - +
