Hierarchical Attention Network for Action Segmentation

Cited by: 4

Authors:

Gammulle, Harshala [1]

Denman, Simon [1]

Sridharan, Sridha [1]

Fookes, Clinton [1]

Affiliations:

[1] Queensland Univ Technol, SAIVT, Image & Video Res Lab, Brisbane, Qld, Australia

Source:

PATTERN RECOGNITION LETTERS | 2020 / Volume 131

Keywords:

Cameras;

DOI:

10.1016/j.patrec.2020.01.023

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Temporal segmentation of events is an essential task and a precursor to the automatic recognition of human actions in video. Several attempts have been made to capture frame-level salient aspects through attention, but they lack the capacity to effectively map the temporal relationships between frames, as they capture only a limited span of temporal dependencies. To this end, we propose a complete end-to-end supervised learning approach that can better learn relationships between actions over time, thus improving overall segmentation performance. The proposed hierarchical recurrent attention framework analyses the input video at multiple temporal scales to form embeddings at the frame level and the segment level, and performs fine-grained action segmentation. This yields a simple, lightweight, yet extremely effective architecture for segmenting continuous video streams, with multiple application domains. We evaluate our system on multiple challenging public benchmark datasets, including the MERL Shopping, 50 Salads, and Georgia Tech Egocentric datasets, and achieve state-of-the-art performance. The evaluated datasets encompass numerous video capture settings, including static overhead camera views and dynamic, egocentric head-mounted camera views, demonstrating the direct applicability of the proposed framework in a variety of settings. (c) 2020 Elsevier B.V. All rights reserved.

Pages: 442-448

Page count: 7

50 items in total

[21] Refining Action Segmentation with Hierarchical Video Representations
Ahn, Hyemin
Lee, Dongheui
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 16282 - 16290
[22] Human Action Segmentation with Hierarchical Supervoxel Consistency
Lu, Jiasen
Xu, Ran
Corso, Jason J.
2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015 : 3762 - 3771
[23] Triple attention network for video segmentation
Tian, Yan
Zhang, Yujie
Zhou, Di
Cheng, Guohua
Chen, Wei-Gang
Wang, Ruili
NEUROCOMPUTING, 2020, 417 (417) : 202 - 211
[24] Bilateral attention network for semantic segmentation
Wang, Dongli
Li, Nanjun
Zhou, Yan
Mu, Jinzhen
IET IMAGE PROCESSING, 2021, 15 (08) : 1607 - 1616
[25] Embedded Attention Network for Semantic Segmentation
Lv, Qingxuan
Feng, Mingzhe
Sun, Xin
Dong, Junyu
Chen, Changrui
Zhang, Yu
IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (01) : 326 - 333
[26] CROSS ATTENTION NETWORK FOR SEMANTIC SEGMENTATION
Liu, Mengyu
Yin, Hujun
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019 : 2434 - 2438
[27] Dual Attention Network for Scene Segmentation
Fu, Jun
Liu, Jing
Tian, Haijie
Li, Yong
Bao, Yongjun
Fang, Zhiwei
Lu, Hanqing
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 3141 - 3149
[28] Dynamic attention network for semantic segmentation
Wu, Fei
Chen, Feng
Jing, Xiao-Yuan
Hu, Chang-Hui
Ge, Qi
Ji, Yimu
NEUROCOMPUTING, 2020, 384 (384) : 182 - 191
[29] Shallow Attention Network for Polyp Segmentation
Wei, Jun
Hu, Yiwen
Zhang, Ruimao
Li, Zhen
Zhou, S. Kevin
Cui, Shuguang
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT I, 2021, 12901 : 699 - 708
[30] Attention to fine-grained information: hierarchical multi-scale network for retinal vessel segmentation
Lyu, Chengzhi
Hu, Guoqing
Wang, Dan
The Visual Computer, 2022, 38 : 345 - 355