Temporal Deformable Transformer for Action Localization

Cited by: 0

Authors:

Wang, Haoying [1]

Wei, Ping [1]

Liu, Meiqin [1]

Zheng, Nanning [1]

Affiliations:

[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Xi'an, People's Republic of China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI | 2023 / Vol. 14259

Funding:

National Natural Science Foundation of China;

Keywords:

Temporal Action Localization; Transformer; Deformable Attention; Video Understanding;

DOI:

10.1007/978-3-031-44223-0_45

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Temporal action localization (TAL) is a challenging task that has received significant attention in video understanding. Recently, Transformer-based models have demonstrated their effectiveness in capturing contextual information and achieved outstanding performance on various TAL benchmarks. However, these methods still face challenges in computational efficiency and contextual modeling rigidity. In this paper, we propose a method to address those problems in Transformer-based models. Our model introduces a temporal deformable Transformer module and the corresponding time normalization, enabling flexible aggregation of temporal context information in videos, leading to enhanced video representations. To demonstrate the effectiveness of the proposed method, we construct a Transformer-based anchor-free model with a simple prediction head, which yields superior performance on widely used benchmarks. Specifically, it achieves an average mAP of 67.4% on THUMOS14 and an average mAP of 36.8% on ActivityNet-v1.3.


Pages: 563-575

Page count: 13

