Multigranularity Feature Aggregation and Cross-level Boundary Modeling for Temporal Action Detection

Authors
Li, Qiang [1, 2]
Liu, Di [1, 3]
Zu, Guang [4]
Li, Sen [1]
Sun, Hui [2]
Wang, Jianzhong [1]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun, Peoples R China
[2] Changchun Humanities & Sci Coll, Changchun, Peoples R China
[3] Northeast Elect Power Univ, Jilin, Peoples R China
[4] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Temporal action detection; action recognition; vision transformers; TRANSFORMER
DOI
10.1145/3712598
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This article presents a Temporal Action Detection (TAD) method with Multigranularity (MG) feature aggregation and Cross-level Boundary Modeling (CBM). Compared with other methods, our proposed approach has the following advantages. First, unlike most existing works, which consider only the local temporal context, a simple and computationally efficient MG module is proposed to comprehensively extract video features at instant, local, and global temporal granularities. Second, unlike methods that employ information from only a single feature pyramid level for action boundary regression, a CBM strategy that integrates the relative information from both same-level and higher-level features is designed to improve the accuracy of boundary prediction. Finally, benefiting from the MG module and CBM strategy, our method outperforms other state-of-the-art approaches on five challenging TAD datasets: THUMOS14, MultiTHUMOS, EPIC-KITCHENS-100, ActivityNet-1.3, and HACS. We make our code and pre-trained model publicly available.
CCS Concepts: • Computing methodologies → Artificial intelligence; Computer vision tasks; Activity recognition and understanding
Pages: 24