Multigranularity Feature Aggregation and Cross-level Boundary Modeling for Temporal Action Detection

Cited by: 0
Authors
Li, Qiang [1, 2]
Liu, Di [1, 3]
Zu, Guang [4]
Li, Sen [1]
Sun, Hui [2]
Wang, Jianzhong [1]
Affiliations
[1] Northeast Normal Univ, Sch Informat Sci & Technol, Changchun, Peoples R China
[2] Changchun Humanities & Sci Coll, Changchun, Peoples R China
[3] Northeast Elect Power Univ, Jilin, Peoples R China
[4] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
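Funding
National Natural Science Foundation of China
Keywords
Temporal action detection; action recognition; vision transformers; TRANSFORMER;
DOI
10.1145/3712598
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract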
This article presents a Temporal Action Detection (TAD) method with Multigranularity (MG) feature aggregation and Cross-level Boundary Modeling (CBM). Compared with other methods, our proposed approach has the following advantages. First, unlike most existing works, which consider only the local temporal context, a simple and computationally efficient MG module is proposed to comprehensively extract video features at instant, local, and global temporal granularities. Second, unlike methods that employ information from only a single feature pyramid level for action boundary regression, a CBM strategy that integrates the relative information from both same-level and higher-level features is designed to improve the accuracy of boundary prediction. Finally, benefiting from the MG module and CBM strategy, our method outperforms other state-of-the-art approaches on five challenging TAD datasets: THUMOS14, MultiTHUMOS, EPIC-KITCHENS-100, ActivityNet-1.3, and HACS. We make our code and pre-trained model publicly available.
CCS Concepts: • Computing methodologies → Artificial intelligence; Computer vision tasks; Activity recognition and understanding
Pages: 24