Multi-scale Dynamic Network for Temporal Action Detection

Cited by: 2
Authors
Ren, Yifan [1, 2]
Xu, Xing [1, 2]
Shen, Fumin [1, 2]
Wang, Zheng [1, 2]
Yang, Yang [1, 2]
Shen, Heng Tao [1, 2]
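Affiliations
[1] University of Electronic Science and Technology of China, Center for Future Media, Chengdu, People's Republic of China
[2] University of Electronic Science and Technology of China, School of Computer Science and Engineering, Chengdu, People's Republic of China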
Funding
National Natural Science Foundation of China
Keywords
Temporal Action Detection; Dynamic Filters; Multi-scale Features
DOI
10.1145/3460426.3463613
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, Temporal Action Detection, a fundamental task in video understanding, has attracted extensive attention. Most existing approaches apply the same model parameters to every input video and therefore cannot adapt to the input during inference. In this paper, we propose a novel model termed Multi-scale Dynamic Network (MDN) to tackle this problem. MDN incorporates multiple Multi-scale Dynamic Modules (MDMs). Each MDM generates video-specific and segment-specific convolution kernels from video content at different scales and adaptively captures rich semantic information for prediction. We also design a new Edge Suppression Loss (ESL) function that makes MDN pay more attention to hard examples. Extensive experiments on two popular benchmarks, ActivityNet-1.3 and THUMOS-14, show that MDN achieves state-of-the-art performance.
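The dynamic-filter idea in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' MDM: the record gives no architectural details, so the function names (`dynamic_temporal_conv`, `context_vector`), the `gen_weights` matrix, the `window` parameter, and the linear-plus-softmax kernel generator are all assumptions made for illustration. The sketch only shows how filter taps can be predicted from the input itself, either from the whole video (one kernel per video) or from a local window (one kernel per temporal position).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(features, center, window):
    """Mean feature over a temporal window around `center`.

    window=None pools the whole video (video-level context);
    a small window gives a segment-level context.
    """
    num_steps = len(features)
    if window is None:
        lo, hi = 0, num_steps
    else:
        lo = max(0, center - window // 2)
        hi = min(num_steps, center + window // 2 + 1)
    num_channels = len(features[0])
    return [sum(features[t][c] for t in range(lo, hi)) / (hi - lo)
            for c in range(num_channels)]


def dynamic_temporal_conv(features, gen_weights, window=None):
    """Apply a temporal filter whose taps are predicted from the input.

    features:    T x C list of per-snippet feature vectors.
    gen_weights: K x C matrix, a hypothetical kernel generator (K odd).
    """
    num_steps, num_channels = len(features), len(features[0])
    half = len(gen_weights) // 2
    output = []
    for t in range(num_steps):
        ctx = context_vector(features, t, window)
        # Assumed kernel generator: one linear layer followed by softmax,
        # so the K taps are non-negative and sum to 1.
        taps = softmax([sum(w * x for w, x in zip(row, ctx))
                        for row in gen_weights])
        out_t = [0.0] * num_channels
        for k, tap in enumerate(taps):
            src = t + k - half
            if 0 <= src < num_steps:  # zero padding at the borders
                for c in range(num_channels):
                    out_t[c] += tap * features[src][c]
        output.append(out_t)
    return output


# Toy input: 4 snippets with 2 channels each, and a 3-tap generator.
features = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [0.5, 0.5]]
gen_weights = [[0.1, -0.2], [0.3, 0.0], [-0.1, 0.2]]
video_level = dynamic_temporal_conv(features, gen_weights)
segment_level = dynamic_temporal_conv(features, gen_weights, window=3)
```

Calling the function with several `window` values and combining the outputs would be one way to mimic kernels generated from different scales; how MDN actually fuses scales is not stated in this record.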
Pages: 267-275
Page count: 9