KORSAL: Key-Point Based Online Real-Time Spatio-Temporal Action Localization

Cited by: 0
Authors
Abeywardena, Kalana [1]
Sumanthiran, Shechem [1]
Jayasundara, Sakuna [1]
Karunasena, Sachira [1]
Rodrigo, Ranga [1]
Jayasekara, Peshala [1]
Affiliations
[1] Univ Moratuwa, Dept Elect & Telecommun Engn, Moratuwa, Sri Lanka
Source
2023 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CCECE | 2023
Keywords
Action Localization; Spatio-Temporal; Online; Real-time; Event Detection
DOI
10.1109/CCECE58730.2023.10288973
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Real-time and online action localization in videos poses a critical and formidable challenge. Achieving accurate action localization necessitates the integration of both temporal and spatial information. However, existing approaches rely on computationally intensive 3D convolutional neural network (CNN) architectures or redundant two-stream architectures with optical flow, rendering them unsuitable for real-time, online applications. To address this, we propose a novel approach that leverages fast and efficient key-point-based bounding box prediction for spatial action localization. Additionally, we introduce a tube-linking algorithm that ensures the temporal continuity of action tubes even in the presence of occlusions. By combining temporal and spatial information into a cascaded input for a single network, we eliminate the need for a two-stream architecture, enabling the network to effectively learn from both types of information. Instead of using computationally demanding optical flow, we extract temporal information efficiently using a structural similarity index map. Despite the simplicity of our approach, our lightweight end-to-end architecture achieves state-of-the-art frame mean average precision (mAP) of 74.7% on the challenging UCF101-24 dataset, demonstrating a notable performance gain of 6.4% over previous online methods. Moreover, we achieve state-of-the-art video mAP results compared to both online and offline methods. Furthermore, our model achieves a frame rate of 41.8 FPS (Frames per second), representing a 10.7% improvement over contemporary real-time methods.
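The abstract states that temporal information comes from a structural similarity index map rather than optical flow, and that it is combined with the spatial input for a single network. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: a box-filter SSIM between two consecutive grayscale frames, with the map appended to the current RGB frame as a fourth channel. The function names, the window size of 7, and the channel-stacking scheme are illustrative choices, not the authors' code.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def box_mean(img, win):
    """Local mean over a win x win window (win odd), same shape as img."""
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    return sliding_window_view(padded, (win, win)).mean(axis=(-2, -1))


def ssim_map(prev_gray, curr_gray, win=7, data_range=255.0):
    """Per-pixel SSIM between two grayscale frames using a uniform window."""
    prev = prev_gray.astype(np.float64)
    curr = curr_gray.astype(np.float64)
    # Standard SSIM stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_c = box_mean(prev, win), box_mean(curr, win)
    var_p = box_mean(prev * prev, win) - mu_p ** 2
    var_c = box_mean(curr * curr, win) - mu_c ** 2
    cov = box_mean(prev * curr, win) - mu_p * mu_c
    num = (2 * mu_p * mu_c + c1) * (2 * cov + c2)
    den = (mu_p ** 2 + mu_c ** 2 + c1) * (var_p + var_c + c2)
    return num / den


def stack_input(curr_rgb, prev_gray, curr_gray):
    """Assumed fusion scheme: append the SSIM map as a fourth channel."""
    s = ssim_map(prev_gray, curr_gray)
    rgb = curr_rgb.astype(np.float64) / 255.0
    return np.concatenate([rgb, s[..., None]], axis=-1)
```

Regions that do not change between the two frames score close to 1, while moving regions score lower, so the map acts as a cheap motion cue. How the paper actually feeds this map to its network may differ from the stacking shown here.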
Pages: 6