FEXNet: Foreground Extraction Network for Human Action Recognition

Cited by: 29
Authors
Shen, Zhongwei [1]
Wu, Xiao-Jun [1]
Xu, Tianyang [1,2]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Jiangsu, Peoples R China
[2] Univ Surrey, Ctr Vis Speech & Signal Proc, Guildford GU2 7XH, Surrey, England
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural networks; Spatiotemporal phenomena; Feature extraction; Three-dimensional displays; Solid modeling; Iron; Image recognition; Foreground-related features; spatiotemporal modeling; action recognition;
DOI
10.1109/TCSVT.2021.3103677
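Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract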
Most human actions in video sequences arise from continuous interactions between foreground subjects rather than from the background scene, so disentangling the foreground from the background is important for advanced action recognition systems. In this paper, we therefore propose a Foreground EXtraction (FEX) block that explicitly models foreground clues to handle action subjects effectively. The FEX block contains two components. The first is a Foreground Enhancement (FE) module, which highlights the feature channels potentially related to action attributes and provides channel-level refinement for the subsequent spatiotemporal modeling. The second is a Scene Segregation (SS) module, which splits feature maps into foreground and background. For the foreground part, a temporal model with dynamic enhancement is constructed, reflecting the essential nature of the action category, while the background is modeled using simple spatial convolutions that map the inputs to a consistent feature space. FEX blocks can be inserted into existing 2D CNNs (the resulting network is denoted FEXNet) for spatiotemporal modeling, concentrating on foreground clues for effective action inference. Experiments on Something-Something V1, Something-Something V2, and Kinetics400 verify the effectiveness of the proposed method.
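The two-stage structure described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not specify how the modules are built, so the squeeze-and-excitation-style gate standing in for FE, the fixed channel split standing in for SS, the motion-based re-weighting standing in for "dynamic enhancement", the box filter standing in for a learned spatial convolution, and every name and parameter (`fg_ratio`, `kernel`, `w1`, `w2`) are assumptions made purely for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def foreground_enhancement(x, w1, w2):
    """Assumed FE form: SE-style channel gating. x has shape (T, C, H, W)."""
    squeezed = x.mean(axis=(0, 2, 3))          # (C,) average over time and space
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)                # (C,) per-channel weight in (0, 1)
    return x * gate[None, :, None, None], gate


def temporal_model(fg, kernel):
    """Per-channel temporal convolution (zero padded), re-weighted by motion.

    The motion term is a hypothetical stand-in for 'dynamic enhancement'.
    """
    t, k = fg.shape[0], len(kernel)
    pad = k // 2
    padded = np.pad(fg, ((pad, pad), (0, 0), (0, 0), (0, 0)))
    out = sum(kernel[i] * padded[i:i + t] for i in range(k))
    # Frame-to-frame difference; the last frame is compared with itself.
    diff = np.abs(np.diff(fg, axis=0, append=fg[-1:]))
    motion = sigmoid(diff.mean(axis=(2, 3), keepdims=True))  # (T, C, 1, 1)
    return out * (1.0 + motion)


def spatial_model(bg):
    """3x3 box filter per frame, a stand-in for a learned 2D convolution."""
    h, w = bg.shape[2:]
    padded = np.pad(bg, ((0, 0), (0, 0), (1, 1), (1, 1)))
    out = sum(padded[:, :, i:i + h, j:j + w] for i in range(3) for j in range(3))
    return out / 9.0


def fex_block(x, w1, w2, fg_ratio=0.5, kernel=(0.25, 0.5, 0.25)):
    """FE gating, then an assumed channel split into foreground and background."""
    enhanced, _ = foreground_enhancement(x, w1, w2)
    n_fg = int(x.shape[1] * fg_ratio)
    fg, bg = enhanced[:, :n_fg], enhanced[:, n_fg:]
    merged = np.concatenate([temporal_model(fg, kernel), spatial_model(bg)], axis=1)
    return x + merged  # residual connection, also an assumption


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal((8, 16, 7, 7))   # 8 frames, 16 channels, 7x7 maps
    w1 = rng.standard_normal((4, 16))           # reduction ratio 4 (hypothetical)
    w2 = rng.standard_normal((16, 4))
    print(fex_block(clip, w1, w2).shape)        # (8, 16, 7, 7)
```

Because the block preserves the input shape, a module of this kind could in principle be dropped between layers of a 2D CNN backbone, which is the insertion pattern the abstract describes.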
Pages: 3141-3151
Page count: 11