Label-efficient object detection via region proposal network pre-training

Cited by: 3
Authors
Dong, Nanqing [1]
Ericsson, Linus [2]
Yang, Yongxin [3]
Leonardis, Ales [4]
McDonagh, Steven [2]
Affiliations
[1] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[2] Univ Edinburgh, Inst Imaging Data & Commun IDCOM, Sch Engn, Edinburgh EH9 3FG, Scotland
[3] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
[4] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, England
Keywords
Self-supervised learning; Object detection;
DOI
10.1016/j.neucom.2024.127376
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Self-supervised pre-training, based on the pretext task of instance discrimination, has fuelled the recent advance in label-efficient object detection. However, existing studies focus on pre-training only a feature extractor network to learn transferable representations for downstream detection tasks. This leads to the necessity of training multiple detection-specific modules from scratch in the fine-tuning phase. We argue that the region proposal network (RPN), a common detection-specific module, can additionally be pre-trained towards reducing the localization error of multi-stage detectors. In this work, we propose a simple pretext task that provides an effective pre-training for the RPN, towards efficiently improving downstream object detection performance. We evaluate the efficacy of our approach on benchmark object detection tasks and additional downstream tasks, including instance segmentation and few-shot detection. In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance, with largest gains found in label-scarce settings.
Pages: 9
Related papers
50 in total
  • [21] Label-Efficient Domain Generalization via Collaborative Exploration and Generalization
    Yuan, Junkun
    Ma, Xu
    Chen, Defang
    Kuang, Kun
    Wu, Fei
    Lin, Lanfen
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 2361 - 2370
  • [22] Efficient Small Object Detection with an Improved Region Proposal Networks
    Ma, DongWen
    Wu, XiaoJun
    Yang, Honghong
    2019 THE 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, CONTROL AND ROBOTICS (EECR 2019), 2019, 533
  • [23] LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization
    Liu, Sheng
    Huynh, Cong Phuoc
    Chen, Cong
    Arap, Maxim
    Hamid, Raffay
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18290 - 18299
  • [24] SODAWideNet - Salient Object Detection with an Attention Augmented Wide Encoder Decoder Network Without ImageNet Pre-training
    Dulam, Rohit Venkata Sai
    Kambhamettu, Chandra
    ADVANCES IN VISUAL COMPUTING, ISVC 2023, PT II, 2023, 14362 : 93 - 105
  • [25] Dictionary Temporal Graph Network via Pre-training Embedding Distillation
    Liu, Yipeng
    Zheng, Fang
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VI, ICIC 2024, 2024, 14880 : 336 - 347
  • [26] Hierarchical objectness network for region proposal generation and object detection
    Wang, Juan
    Tao, Xiaoming
    Xu, Mai
    Duan, Yiping
    Lu, Jianhua
    PATTERN RECOGNITION, 2018, 83 : 260 - 272
  • [27] REGION PROPOSAL RANKING VIA FUSION FEATURE FOR OBJECT DETECTION
    Li, Xi
    Ma, Huimin
    Wang, Xiang
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 1298 - 1302
  • [28] Unsupervised Pre-Training for Detection Transformers
    Dai, Zhigang
    Cai, Bolun
    Lin, Yugeng
    Chen, Junying
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (11) : 12772 - 12782
  • [29] Always be Pre-Training: Representation Learning for Network Intrusion Detection with GNNs
    Gu, Zhengyao
    Lopez, Diego Troy
    Alrahis, Lilas
    Sinanoglu, Ozgur
    2024 25TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, ISQED 2024, 2024,
  • [30] Unifying Event Detection and Captioning as Sequence Generation via Pre-training
    Zhang, Qi
    Song, Yuqing
    Jin, Qin
    COMPUTER VISION, ECCV 2022, PT XXXVI, 2022, 13696 : 363 - 379