Class-Aware Dual-Supervised Aggregation Network for Video Object Detection

Cited by: 1
Authors
Qi, Qiang [1, 2]
Yan, Yan [1, 2]
Wang, Hanzi [1, 2]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361005, Peoples R China
[2] Xiamen Univ, Key Lab Multimedia Trusted Percept & Efficient Com, Minist Educ China, Xiamen 361005, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video object detection; distillation supervision; graph-guided feature aggregation; contrastive supervision; NAVIGATION;
DOI
10.1109/TMM.2023.3292615
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Video object detection has attracted increasing attention in recent years. Although great success has been achieved by off-the-shelf video object detection methods through delicately designing various types of feature aggregation, they overlook the class-aware supervision and thus still suffer from the problem of classification incapability, which means the classification between objects with deteriorated or similar appearances is error-prone. In this article, we propose a novel class-aware dual-supervised aggregation network (CDANet) for video object detection, including three substantial improvements to effectively alleviate the classification incapability problem of previous methods. First, we develop a class-aware cross-modality distillation supervision that transfers the semantic knowledge of label data to the features of video data, effectively enhancing the semantic representations of features. Second, we design a graph-guided feature aggregation module that effectively models the structural relations between features by leveraging the dynamic residual graph convolutional network, enabling our CDANet to perform more effective feature aggregation in the temporal domain. Third, we present a class-aware proposal contrastive supervision to maximize the intra-class agreement and inter-class disagreement, which is conducive to improving the semantic discriminability of features. The class-aware dual supervision and feature aggregation are tightly tied into a unified end-to-end framework to make our CDANet fully exploit class-specific semantic knowledge and inter-frame temporal dependencies to enhance object appearance representations, which facilitates the classification of detected objects. We conduct experiments on the challenging ImageNet VID dataset, and the results demonstrate the superiority of our CDANet against state-of-the-art methods. More remarkably, our CDANet achieves 85.4% mAP with ResNet-101 or 86.5% mAP with ResNeXt-101.
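The abstract states the goal of the class-aware proposal contrastive supervision (maximize intra-class agreement and inter-class disagreement) but not its exact loss. As a rough illustration only, the sketch below uses a generic supervised contrastive (InfoNCE-style) loss over proposal features. This is a stand-in chosen for illustration, not the formulation used in CDANet; the function name `class_aware_contrastive_loss`, the `temperature` parameter, and the toy feature vectors are all hypothetical.

```python
import math


def class_aware_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumption, not CDANet's loss).

    features: list of equal-length float lists, one per proposal.
    labels:   list of class ids, one per proposal.
    Returns the mean loss over anchors that have at least one
    same-class partner, or 0.0 if no such anchor exists.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        normed.append([x / norm for x in f])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # no same-class proposal to pull this anchor toward

        # Temperature-scaled similarity of the anchor to every other proposal.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor proposals.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability of picking a same-class proposal.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1

    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    # Toy proposals: two near the x-axis, two near the y-axis.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(class_aware_contrastive_loss(feats, [0, 0, 1, 1]))
    print(class_aware_contrastive_loss(feats, [0, 1, 0, 1]))
```

In this sketch the loss is small when proposals of the same class are close in feature space and grows when same-class proposals are far apart, which is the behavior the abstract attributes to the contrastive supervision.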
Pages: 2109-2123
Page count: 15