Hierarchical Attention Network for Action Segmentation

Cited by: 4
Authors
Gammulle, Harshala [1]
Denman, Simon [1]
Sridharan, Sridha [1]
Fookes, Clinton [1]
Affiliations
[1] Queensland Univ Technol, SAIVT, Image & Video Res Lab, Brisbane, Qld, Australia
Keywords
Cameras;
DOI
10.1016/j.patrec.2020.01.023
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal segmentation of events is an essential task and a precursor to the automatic recognition of human actions in video. Several attempts have been made to capture frame-level salient aspects through attention, but they lack the capacity to effectively map the temporal relationships between frames, as they capture only a limited span of temporal dependencies. To this end, we propose a complete end-to-end supervised learning approach that can better learn relationships between actions over time, thus improving overall segmentation performance. The proposed hierarchical recurrent attention framework analyses the input video at multiple temporal scales to form embeddings at the frame level and the segment level, and performs fine-grained action segmentation. This yields a simple, lightweight, yet extremely effective architecture for segmenting continuous video streams, with multiple application domains. We evaluate our system on multiple challenging public benchmark datasets, including the MERL Shopping, 50 Salads, and Georgia Tech Egocentric datasets, and achieve state-of-the-art performance. The evaluated datasets encompass numerous video capture settings, including static overhead camera views and dynamic, egocentric head-mounted camera views, demonstrating the direct applicability of the proposed framework in a variety of settings. (c) 2020 Elsevier B.V. All rights reserved.
Pages: 442-448
Number of pages: 7
Related papers
50 in total
  • [21] Refining Action Segmentation with Hierarchical Video Representations
    Ahn, Hyemin
    Lee, Dongheui
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 16282 - 16290
  • [22] Human Action Segmentation with Hierarchical Supervoxel Consistency
    Lu, Jiasen
    Xu, Ran
    Corso, Jason J.
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 3762 - 3771
  • [23] Triple attention network for video segmentation
    Tian, Yan
    Zhang, Yujie
    Zhou, Di
    Cheng, Guohua
    Chen, Wei-Gang
    Wang, Ruili
    NEUROCOMPUTING, 2020, 417 (417) : 202 - 211
  • [24] Bilateral attention network for semantic segmentation
    Wang, Dongli
    Li, Nanjun
    Zhou, Yan
    Mu, Jinzhen
    IET IMAGE PROCESSING, 2021, 15 (08) : 1607 - 1616
  • [25] Embedded Attention Network for Semantic Segmentation
    Lv, Qingxuan
    Feng, Mingzhe
    Sun, Xin
    Dong, Junyu
    Chen, Changrui
    Zhang, Yu
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (01) : 326 - 333
  • [26] CROSS ATTENTION NETWORK FOR SEMANTIC SEGMENTATION
    Liu, Mengyu
    Yin, Hujun
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 2434 - 2438
  • [27] Dual Attention Network for Scene Segmentation
    Fu, Jun
    Liu, Jing
    Tian, Haijie
    Li, Yong
    Bao, Yongjun
    Fang, Zhiwei
    Lu, Hanqing
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3141 - 3149
  • [28] Dynamic attention network for semantic segmentation
    Wu, Fei
    Chen, Feng
    Jing, Xiao-Yuan
    Hu, Chang-Hui
    Ge, Qi
    Ji, Yimu
    NEUROCOMPUTING, 2020, 384 (384) : 182 - 191
  • [29] Shallow Attention Network for Polyp Segmentation
    Wei, Jun
    Hu, Yiwen
    Zhang, Ruimao
    Li, Zhen
    Zhou, S. Kevin
    Cui, Shuguang
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT I, 2021, 12901 : 699 - 708
  • [30] Attention to fine-grained information: hierarchical multi-scale network for retinal vessel segmentation
    Chengzhi Lyu
    Guoqing Hu
    Dan Wang
    The Visual Computer, 2022, 38 : 345 - 355