MINet: Meta-Learning Instance Identifiers for Video Object Detection

Cited by: 13
Authors
Deng, Jiajun [1]
Pan, Yingwei [2]
Yao, Ting [2]
Zhou, Wengang [1]
Li, Houqiang [1]
Mei, Tao [2]
Affiliations
[1] Univ Sci & Technol China USTC, Dept Elect Engn & Informat Sci, Hefei 230026, Peoples R China
[2] JD AI Res, Beijing 100105, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Feature extraction; Detectors; Proposals; Optical imaging; Robustness; History; Video object detection; meta learning; memory network; box association; NETWORKS;
DOI
10.1109/TIP.2021.3099409
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in video object detection have focused on exploiting temporal coherence across frames to improve object detectors. However, previous solutions rely either on additional inputs (e.g., optical flow) to guide feature aggregation or on complex post-processing to associate bounding boxes. In this paper, we introduce a simple but effective design that learns instance identifiers for instance association in a meta-learning paradigm and requires no auxiliary inputs or post-processing. Specifically, we present Meta-Learnt Instance Identifier Networks (MINet), which meta-learn instance identifiers to recognize identical instances across frames in a single forward pass, leading to robust online linking of instances. Technically, based on the detection results of previous frames, we teach MINet to learn the weights of an instance identifier on the fly, which can then be applied to upcoming frames. This meta-learning paradigm enables instance identifiers to be flexibly adapted to novel frames at inference. Furthermore, MINet writes/updates the detection results of previous instances into memory and reads from memory during inference to encourage temporal consistency for video object detection. MINet is appealing in that it can be plugged into any object detection model. Extensive experiments on the ImageNet VID dataset demonstrate the superiority of MINet. More remarkably, by integrating MINet into Faster R-CNN, we achieve 80.2% mAP on the ImageNet VID dataset.
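The idea of deriving identifier weights from earlier detections and reusing them on the next frame can be illustrated with a toy sketch. This is not the MINet architecture: it swaps in a simple prototype-style identifier (weights set to normalized stored features, matched by cosine similarity) and a running-average memory. All function names, the feature vectors, the `threshold` and the `momentum` value are hypothetical and chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def build_identifier(memory):
    """Derive identifier weights on the fly: one unit vector per stored instance.

    Assumption: a prototype-style stand-in for a meta-learned identifier.
    """
    return {inst_id: normalize(feat) for inst_id, feat in memory.items()}


def identify(weights, feature, threshold=0.5):
    """Return the best-matching instance id, or None if no score reaches threshold."""
    f = normalize(feature)
    best_id, best_score = None, threshold
    for inst_id, w in weights.items():
        score = sum(a * b for a, b in zip(w, f))  # cosine similarity
        if score >= best_score:
            best_id, best_score = inst_id, score
    return best_id


def update_memory(memory, inst_id, feature, momentum=0.8):
    """Write a new instance into memory, or blend into an existing entry."""
    if inst_id in memory:
        old = memory[inst_id]
        memory[inst_id] = [momentum * o + (1 - momentum) * x
                           for o, x in zip(old, feature)]
    else:
        memory[inst_id] = list(feature)


# Frame 1: two detections are written to memory (made-up features).
memory = {}
update_memory(memory, 0, [1.0, 0.0, 0.1])
update_memory(memory, 1, [0.0, 1.0, 0.1])

# Frame 2: identifier weights come from memory and are applied to new detections.
weights = build_identifier(memory)
frame2 = [[0.9, 0.1, 0.1], [0.1, 0.8, 0.2], [0.0, 0.0, 1.0]]
links = [identify(weights, f) for f in frame2]
print(links)
```

In this toy run the first two detections of the second frame link to the stored instances, while the third matches nothing and would be treated as a new instance. A real system would obtain features from the detector and learn the identifier weights rather than copy them.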
Pages: 6879-6891
Page count: 13
Related papers
50 in total
  • [31] Federated Meta-Learning for Fraudulent Credit Card Detection
    Zheng, Wenbo
    Yan, Lan
    Gou, Chao
    Wang, Fei-Yue
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 4654 - 4660
  • [32] Active Meta-Learning with Uncertainty Sampling and Outlier Detection
    Prudencio, Ricardo B. C.
    Ludermir, Teresa B.
    2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008, : 346 - 351
  • [33] Meta-Learning Based Classification for Moving Object Trajectories in Mobile IoT
    Chen, Yuanyi
    Yu, Peng
    Chen, Wenwang
    Zheng, Zengwei
    Guo, Minyi
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (02) : 584 - 596
  • [34] On the Effectiveness of Sentence Encoding for Intent Detection Meta-Learning
    Ma, Tingting
    Wu, Qianhui
    Yu, Zhiwei
    Zhao, Tiejun
    Lin, Chin-Yew
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 3806 - 3818
  • [35] Discrepant multiple instance learning for weakly supervised object detection
    Gao, Wei
    Wan, Fang
    Yue, Jun
    Xu, Songcen
    Ye, Qixiang
    PATTERN RECOGNITION, 2022, 122
  • [36] INSTANCE-AWARE UNCERTAINTY FOR ACTIVE LEARNING IN OBJECT DETECTION
    Zhang, Zhipeng
    Ma, Wenting
    Yuan, Xiaohang
    Hao, Yuan
    Guo, Meng
    Tang, Hongyi
    Zhou, Zhiheng
    Yao, Zhenjie
    2024 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2024, : 298 - 304
  • [37] Combining Deep Learning and Verification for Precise Object Instance Detection
    Ancha, Siddharth
    Nan, Junyu
    Held, David
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [38] Meta-learning in Reinforcement Learning
    Schweighofer, N
    Doya, K
    NEURAL NETWORKS, 2003, 16 (01) : 5 - 9
  • [39] Harnessing Meta-Learning for Improving Full-Frame Video Stabilization
    Alin, Muhammad Kashif
    Im, Eun Woo
    Kim, Dongjin
    Kim, Tae Hyun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 12605 - 12614
  • [40] Learning to Forget for Meta-Learning
    Baik, Sungyong
    Hong, Seokil
    Lee, Kyoung Mu
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2376 - 2384