Action Recognition Using Multi-stream 2D CNN with Deep Learning-Based Temporal Modality

Cited by: 1
Authors
Kang, Keonwoo [1]
Park, Sangwoo [1]
Park, Hasil [1]
Kang, Donggoo [1]
Paik, Joonki [1, 2]
Affiliations
[1] Chung Ang Univ, Grad Sch Adv Imaging Sci Multimedia & Film, Seoul, South Korea
[2] Chung Ang Univ, Grad Sch Artificial Intelligence, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Action recognition; Temporal modality;
DOI
10.1109/ICCE56470.2023.10043568
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Video action recognition requires accurate analysis of both the motion and the spatial information of an object. In other words, a model must learn temporal as well as spatial information. Many deep learning-based action recognition methods extract the two with a multi-stream network, in which the temporal stream analyzes motion information using mathematical operations. In this paper, we present an action recognition method using a multi-stream network with a deep learning-based temporal relation module, which extracts motion information for the entire video in the temporal network path. The proposed method significantly increases action recognition accuracy by attaching modules in front of the 2D CNN and applying late fusion with another network path. Because the proposed temporal stream network requires no additional mathematical operations, it greatly reduces the amount of computation. As a result, the proposed method is suitable for a wide range of real-time visual action recognition tasks.
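The late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted average of per-stream softmax scores used here is an assumption, and the names `late_fusion`, `predict`, and `temporal_weight` are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Fuse two streams at the score level.

    Assumption: a weighted average of softmax scores. The paper's actual
    fusion rule is not given in the abstract.
    """
    if len(spatial_logits) != len(temporal_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= temporal_weight <= 1.0:
        raise ValueError("temporal_weight must lie in [0, 1]")
    spatial_probs = softmax(spatial_logits)
    temporal_probs = softmax(temporal_logits)
    return [
        (1.0 - temporal_weight) * s + temporal_weight * t
        for s, t in zip(spatial_probs, temporal_probs)
    ]


def predict(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Return the index of the class with the highest fused score."""
    fused = late_fusion(spatial_logits, temporal_logits, temporal_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

Each stream is run independently and only the class scores are combined, which is what distinguishes late fusion from merging intermediate feature maps. Setting `temporal_weight` to 0 or 1 reduces the prediction to a single stream, which is a convenient way to check how much each path contributes.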
Pages: 3
Related papers
50 in total
  • [1] Multi-stream CNN: Learning representations based on human-related regions for action recognition
    Tu, Zhigang
    Xie, Wei
    Qin, Qianqing
    Poppe, Ronald
    Veltkamp, Remco C.
    Li, Baoxin
    Yuan, Junsong
    PATTERN RECOGNITION, 2018, 79 : 32 - 43
  • [2] Multimodal Egocentric Activity Recognition Using Multi-stream CNN
    Imran, Javed
    Raman, Balasubramanian
    ELEVENTH INDIAN CONFERENCE ON COMPUTER VISION, GRAPHICS AND IMAGE PROCESSING (ICVGIP 2018), 2018,
  • [3] Multi-Stream Deep Neural Networks for RGB-D Egocentric Action Recognition
    Tang, Yansong
    Wang, Zian
    Lu, Jiwen
    Feng, Jianjiang
    Zhou, Jie
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (10) : 3001 - 3015
  • [4] Multimodal Multi-stream Deep Learning for Egocentric Activity Recognition
    Song, Sibo
    Chandrasekhar, Vijay
    Mandal, Bappaditya
    Li, Liyuan
    Lim, Joo-Hwee
    Babu, Giduthuri Sateesh
    San, Phyo Phyo
    Cheung, Ngai-Man
    PROCEEDINGS OF 29TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, (CVPRW 2016), 2016, : 378 - 385
  • [5] Multi-stream 3D CNN structure for human action recognition trained by limited data
    Chenarlogh, Vahid Ashkani
    Razzazi, Farbod
    IET COMPUTER VISION, 2019, 13 (03) : 338 - 344
  • [6] A Multi-View Human Action recognition System in Limited Data case using multi-stream CNN
    Chenarlogh, Vahid Ashkani
    Razzazi, Farbod
    Mohammadyahya, Najmeh
    2019 5TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS 2019), 2019,
  • [7] Multi-stream Architecture with Symmetric Extended Visual Rhythms for Deep Learning Human Action Recognition
    Tacon, Hemerson
    Brito, Andre de Souza
    Chaves, Hugo de Lima
    Vieira, Marcelo Bernardes
    Villela, Saulo Moraes
    Maia, Helena de Almeida
    Concha, Darwin Ttito
    Pedrini, Helio
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020, : 351 - 358
  • [8] Image Sequence Based Cyclist Action Recognition Using Multi-Stream 3D Convolution
    Zernetsch, Stefan
    Schreck, Steven
    Kress, Viktor
    Doll, Konrad
    Sick, Bernhard
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2620 - 2626
  • [9] A Multi-Stream Approach to Mixed-Traffic Accident Recognition Using Deep Learning
    Fu, Swee Tee
    Theng, Lau Bee
    Shiong, Brian Loh Chung
    Mccarthy, Chris
    Tsun, Mark Tee Kit
    IEEE ACCESS, 2024, 12 : 185232 - 185249
  • [10] An Investigation of Skeleton-Based Optical Flow-Guided Features for 3D Action Recognition Using a Multi-Stream CNN Model
    Ren, J.
    Reyes, N. H.
    Barczak, A. L. C.
    Scogings, C.
    Liu, M.
    2018 IEEE 3RD INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC), 2018, : 199 - 203