Classification Matters: Improving Video Action Detection with Class-Specific Attention

Cited by: 0
Authors
Lee, Jinsung [1, 2]
Kim, Taeoh [2]
Lee, Inwoong [2]
Shim, Minho [2]
Wee, Dongyoon [2]
Cho, Minsu [1]
Kwak, Suha [1]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Pohang Si, South Korea
[2] NAVER Cloud, Seongnam Si, South Korea
Source
Keywords
Video action detection; Video transformer
DOI
10.1007/978-3-031-72661-3_26
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video action detection (VAD) aims to detect actors and classify their actions in a video. We find that VAD suffers more from classification than from localization of actors. Hence, we analyze how prevailing methods form features for classification and find that they prioritize actor regions, yet often overlook the contextual information necessary for accurate classification. Accordingly, we propose to reduce the bias toward actors and encourage attention to the context that is relevant to each action class. By assigning a class-dedicated query to each action class, our model can dynamically determine where to focus for effective classification. The proposed model demonstrates superior performance on three challenging benchmarks with significantly fewer parameters and less computation.
Pages: 450-467
Page count: 18
Related papers
50 in total
  • [1] Class-specific feature sets in classification
    Baggenstoss, PM
    JOINT CONFERENCE ON THE SCIENCE AND TECHNOLOGY OF INTELLIGENT SYSTEMS, 1998, : 413 - 416
  • [2] Class-specific feature sets in classification
    Baggenstoss, PM
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 1999, 47 (12) : 3428 - 3432
  • [3] MINING HETEROGENEOUS CLASS-SPECIFIC CODEBOOK FOR CATEGORICAL OBJECT DETECTION AND CLASSIFICATION
    Pan, Hong
    Zhu, Yaping
    Qin, A. K.
    Xia, Liangzheng
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 3132 - 3136
  • [4] Sufficiency classification, and the class-specific feature theorem
    Kay, S
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2000, 46 (04) : 1654 - 1658
  • [5] CLASS-SPECIFIC CHANNEL ATTENTION FOR FEW SHOT LEARNING
    Hsieh, Yi-Kuan
    Hsieh, Jun-Wei
    Chen, Ying-Yu
    2024 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2024, : 1012 - 1018
  • [6] Class-Specific Neural Network for Video Compressed Sensing
    Pei, Yifei
    Liu, Ying
    Ling, Nam
    Liu, Lingzhi
    Ren, Yongxiong
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [7] Mining for class-specific motifs in protein sequence classification
    Satish M Srinivasan
    Suleyman Vural
    Brian R King
    Chittibabu Guda
    BMC Bioinformatics, 14
  • [8] Class-Specific Hough Forests for Object Detection
    Gall, Juergen
    Lempitsky, Victor
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 1022 - +
  • [9] Audio Classification Using Class-Specific Learned Descriptors
    Sonowal, Sukanya
    Sandhan, Tushar
    Choi, Inkyu
    Kim, Nam Soo
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 484 - 487
  • [10] LEARNING CLASS-SPECIFIC POOLING SHAPES FOR IMAGE CLASSIFICATION
    Wang, Jinzhuo
    Wang, Wenmin
    Wang, Ronggang
    Gao, Wen
    2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), 2015,