Multi-stream 3D CNN structure for human action recognition trained by limited data

Cited by: 25
Authors
Chenarlogh, Vahid Ashkani [1]
Razzazi, Farbod [1]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Elect & Comp Engn, Tehran, Iran
Keywords
object recognition; image motion analysis; image classification; cameras; feature extraction; learning (artificial intelligence); video signal processing; image sequences; convolutional neural nets; multistream 3D CNN structure; human action recognition; training performance; training data case; optical flows; vertical directions; three-dimensional CNNs; four-stream 3D CNNs; single-stream model; two-stream architecture; four-stream architecture; information channels; separate streams; action recognition system; data set; four-stream structure; convolutional neural network architectures; optical flow; recognition rate; IXMAS; FEATURES;
DOI
10.1049/iet-cvi.2018.5088
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Here, the authors propose a solution for improving training performance in human action recognition when training data are limited. Four channels of information are first generated from each frame: optical flow and gradients, each in the horizontal and vertical directions. These channels are fed to three-dimensional (3D) convolutional neural networks (CNNs) in three proposed architectures: single-stream, two-stream, and four-stream. In the single-stream model, all four channels from each frame are applied to a single stream. In the two-stream architecture, optical flow-x and optical flow-y are applied to one stream and gradient-x and gradient-y to another. In the four-stream architecture, each of the four channels is applied to its own separate stream. The architectures were evaluated in an action recognition system on IXMAS, a data set recorded simultaneously by five cameras. The four-stream architecture outperformed the other two, achieving recognition rates of 87.5, 91.66, 91.11, 88.05, and 81.94% for cameras 0-4, respectively (88.05% on average).
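As an illustration only, the channel-to-stream routing described in the abstract can be sketched as below. The channel labels, the `STREAM_LAYOUTS` table, and the `route_channels` helper are hypothetical names chosen for this sketch and are not taken from the paper; the network layers themselves are not modelled. The last lines simply recompute the reported four-stream average from the per-camera rates.

```python
from statistics import mean

# Hypothetical labels for the four per-frame channels named in the abstract.
CHANNELS = ["flow_x", "flow_y", "grad_x", "grad_y"]

# Channel grouping per architecture, as described in the abstract.
STREAM_LAYOUTS = {
    "single_stream": [["flow_x", "flow_y", "grad_x", "grad_y"]],
    "two_stream": [["flow_x", "flow_y"], ["grad_x", "grad_y"]],
    "four_stream": [["flow_x"], ["flow_y"], ["grad_x"], ["grad_y"]],
}


def route_channels(clip, layout):
    """Split a clip, given as {channel_name: data}, into one input dict per stream."""
    return [{name: clip[name] for name in stream} for stream in layout]


# Four-stream recognition rates (%) reported for IXMAS cameras 0-4.
FOUR_STREAM_RATES = [87.5, 91.66, 91.11, 88.05, 81.94]
average_rate = round(mean(FOUR_STREAM_RATES), 2)  # agrees with the reported 88.05
```

Each stream would then receive its own 3D CNN branch; how the branches are fused is not stated in the abstract, so it is left out here.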
Pages: 338-344
Number of pages: 7
Related papers
50 in total
  • [1] A Multi-View Human Action recognition System in Limited Data case using multi-stream CNN
    Chenarlogh, Vahid Ashkani
    Razzazi, Farbod
    Mohammadyahya, Najmeh
    2019 5TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS 2019), 2019,
  • [2] Multi-stream CNN for facial expression recognition in limited training data
    Aghamaleki, Javad Abbasi
    Chenarlogh, Vahid Ashkani
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (16) : 22861 - 22882
  • [3] Multi-stream CNN for facial expression recognition in limited training data
    Javad Abbasi Aghamaleki
    Vahid Ashkani Chenarlogh
    Multimedia Tools and Applications, 2019, 78 : 22861 - 22882
  • [4] 3D CNN for Human Action Recognition
    Boualia, Sameh Neili
    Ben Amara, Najoua Essoukri
    2021 18TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2021, : 276 - 282
  • [5] Multi-Stream Interaction Networks for Human Action Recognition
    Wang, Haoran
    Yu, Baosheng
    Li, Jiaqi
    Zhang, Linlin
    Chen, Dongyue
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (05) : 3050 - 3060
  • [6] Multi-stream CNN: Learning representations based on human-related regions for action recognition
    Tu, Zhigang
    Xie, Wei
    Qin, Qianqing
    Poppe, Ronald
    Veltkamp, Remco C.
    Li, Baoxin
    Yuan, Junsong
    PATTERN RECOGNITION, 2018, 79 : 32 - 43
  • [7] Image Sequence Based Cyclist Action Recognition Using Multi-Stream 3D Convolution
    Zernetsch, Stefan
    Schreck, Steven
    Kress, Viktor
    Doll, Konrad
    Sick, Bernhard
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2620 - 2626
  • [8] An Investigation of Skeleton-Based Optical Flow-Guided Features for 3D Action Recognition Using a Multi-Stream CNN Model
    Ren, J.
    Reyes, N. H.
    Barczak, A. L. C.
    Scogings, C.
    Liu, M.
    2018 IEEE 3RD INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC), 2018, : 199 - 203
  • [9] Recognition of Human Continuous Action with 3D CNN
    Yu, Gang
    Li, Ting
    COMPUTER VISION SYSTEMS, ICVS 2017, 2017, 10528 : 314 - 322
  • [10] An Automatic Estimation of Arterial Input Function Based on Multi-Stream 3D CNN
    Fan, Shengyu
    Bian, Yueyan
    Wang, Erling
    Kang, Yan
    Wang, Danny J. J.
    Yang, Qi
    Ji, Xunming
    FRONTIERS IN NEUROINFORMATICS, 2019, 13