Deep Learning Approach for Human Action Recognition Using a Time Saliency Map Based on Motion Features Considering Camera Movement and Shot in Video Image Sequences

Cited by: 4

Authors:

Alavigharahbagh, Abdorreza [1]

Hajihashemi, Vahid [1]

Machado, Jose J. M. [2]

Tavares, Joao Manuel R. S. [2]

Moscato, Vincenzo

Affiliations:

[1] Univ Porto, Fac Engn, Rua Dr Roberto Frias S-N, P-4200465 Porto, Portugal

[2] Univ Porto, Fac Engn, Dept Engn Mecan, Rua Dr Roberto Frias S-N, P-4200465 Porto, Portugal

Source:

INFORMATION | 2023 / Volume 14 / Issue 11

Keywords:

Human Action Recognition (HAR); deep learning; RNN; time saliency map; camera's movement cancellation; REPRESENTATION; FLOW;

DOI:

10.3390/info14110616

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this article, a hierarchical method for action recognition based on temporal and spatial features is proposed. In current human action recognition (HAR) methods, camera movement, sensor movement, sudden scene changes, and scene movement can increase motion feature errors and decrease accuracy. Another important aspect of a HAR method is its computational cost. To address these challenges, the proposed method adds a preprocessing step that uses optical flow to detect camera movements and shots in the input video image sequences. In the temporal processing block, the optical flow technique is combined with the absolute value of frame differences to obtain a time saliency map. The detection of shots, the cancellation of camera movement, and the building of a time saliency map minimise movement detection errors. The time saliency map is then passed to the spatial processing block to segment the moving persons and/or objects in the scene. Because the search region for spatial processing is limited by the temporal processing results, the computations in the spatial domain are drastically reduced. In the spatial processing block, the scene foreground is extracted in three steps: silhouette extraction, active contour segmentation, and colour segmentation. Key points are selected at the borders of the segmented foreground. The final features are the intensity and angle of the optical flow at the detected key points. Using key point features for action detection reduces the computational cost of the classification step and the required training time. Finally, the features are submitted to a Recurrent Neural Network (RNN) to recognise the action involved. The proposed method was tested, and its efficiency evaluated, on four well-known action datasets: KTH, Weizmann, HMDB51, and UCF101. Since the proposed approach segments salient objects based on motion, edges, and colour features, it can be added as a preprocessing step to most current HAR systems to improve performance.
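The temporal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the formulas. The sketch assumes that camera movement is approximated by the median flow vector and subtracted, that the time saliency map is the per-pixel product of the residual flow magnitude and the absolute frame difference, and that the key point features are the magnitude and angle of the residual flow. All function names are hypothetical, and the dense flow field is taken as a precomputed input (in practice it would come from an optical flow routine).

```python
import math
from statistics import median


def cancel_camera_motion(flow):
    """Subtract the median flow vector from every pixel.

    flow is a 2D grid (list of rows) of (dx, dy) tuples.
    ASSUMPTION: the median flow approximates global camera movement.
    """
    cam_dx = median(dx for row in flow for dx, _ in row)
    cam_dy = median(dy for row in flow for _, dy in row)
    return [[(dx - cam_dx, dy - cam_dy) for dx, dy in row] for row in flow]


def time_saliency_map(prev_frame, frame, flow):
    """Combine residual flow magnitude with the absolute frame difference.

    ASSUMPTION: the two cues are combined by a per-pixel product.
    """
    residual = cancel_camera_motion(flow)
    return [
        [math.hypot(dx, dy) * abs(cur - prev)
         for (dx, dy), cur, prev in zip(flow_row, cur_row, prev_row)]
        for flow_row, cur_row, prev_row in zip(residual, frame, prev_frame)
    ]


def keypoint_features(flow, keypoints):
    """Return (magnitude, angle in radians) of the flow at each (row, col)."""
    return [
        (math.hypot(*flow[r][c]), math.atan2(flow[r][c][1], flow[r][c][0]))
        for r, c in keypoints
    ]


# Toy 3x3 example: the camera pans by (1, 0) everywhere,
# and the centre pixel additionally moves by (0, 2).
flow = [[(1, 0)] * 3 for _ in range(3)]
flow[1][1] = (1, 2)
prev_frame = [[10] * 3 for _ in range(3)]
frame = [[10] * 3 for _ in range(3)]
frame[1][1] = 50  # intensity changes only where the object moved

saliency = time_saliency_map(prev_frame, frame, flow)
features = keypoint_features(cancel_camera_motion(flow), [(1, 1)])
```

In this toy case only the centre pixel remains salient once the pan is removed, which mirrors the abstract's point that the spatial search region can be limited to where the temporal block finds motion.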


Pages: 27

