Exploring Spatial-Temporal Instance Relationships in an Intermediate Domain for Image-to-Video Object Detection

Cited by: 0
Authors
Wen, Zihan [1 ]
Chen, Jin [1 ]
Wu, Xinxiao [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Lab Intelligent Informat Technol, Beijing, Peoples R China
Keywords
Deep learning; Object detection; Domain adaptation
DOI
10.1007/978-3-031-27066-6_25
Chinese Library Classification: TP18 [Artificial intelligence theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Image-to-video object detection leverages annotated images to help detect objects in unannotated videos, breaking the heavy dependency on expensive annotation of large-scale video frames. This task is extremely challenging due to the serious domain discrepancy between images and video frames caused by appearance variance and motion blur. Previous methods perform both image-level and instance-level alignment to reduce the domain discrepancy, but false instance alignments may limit their performance in real scenarios. We propose a novel spatial-temporal graph that models the contextual relationships between instances to alleviate these false alignments. Through message propagation over the graph, visual information from spatially and temporally neighboring object proposals is adaptively aggregated to enhance the current instance representation. Moreover, to adapt the source-biased decision boundary to the target data, we generate an intermediate domain between images and frames. It is worth mentioning that our method can be easily applied as a plug-and-play component to other image-to-video object detection models based on instance alignment. Experiments on several datasets demonstrate the effectiveness of our method. Code will be available at: https://github.com/wenzihan/STMP.
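The message-propagation step described in the abstract (adaptively aggregating features from spatially and temporally neighboring object proposals to enhance each instance representation) can be sketched roughly as below. The adjacency construction, similarity-based edge weighting, and residual update rule here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def message_propagation(feats, adjacency, alpha=0.5):
    """One round of message passing over a proposal graph (hypothetical sketch).

    feats:     (N, D) array of proposal features.
    adjacency: (N, N) binary matrix linking spatially/temporally
               neighboring proposals.
    alpha:     mixing weight between a proposal's own feature and the
               aggregated neighbor message (assumed, not from the paper).
    """
    adj = adjacency + np.eye(len(feats))          # add self-loops
    sim = feats @ feats.T                         # pairwise dot-product similarity
    sim = np.where(adj > 0, sim, -np.inf)         # mask out non-neighbors
    sim = sim - sim.max(axis=1, keepdims=True)    # numerical stability for softmax
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    messages = weights @ feats                    # aggregate neighbor features
    return (1 - alpha) * feats + alpha * messages  # residual update
```

A proposal with no graph neighbors attends only to itself and keeps its original feature, while connected proposals blend in similarity-weighted context from their spatial-temporal neighbors.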
Pages: 360 - 375 (16 pages)
Related papers (50 total)
  • [41] Learning Image and Video Compression through Spatial-Temporal Energy Compaction
    Cheng, Zhengxue
    Sun, Heming
    Takeuchi, Masaru
    Katto, Jiro
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10063 - 10072
  • [42] Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
    Ding, Zihan
    Hui, Tianrui
    Huang, Junshi
    Wei, Xiaoming
    Han, Jizhong
    Liu, Si
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4954 - 4963
  • [43] End-to-End Video Instance Segmentation via Spatial-Temporal Graph Neural Networks
    Wang, Tao
    Xu, Ning
    Chen, Kean
    Lin, Weiyao
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 10777 - 10786
  • [44] Exploring Rich and Efficient Spatial Temporal Interactions for Real-Time Video Salient Object Detection
    Chen, Chenglizhao
    Wang, Guotao
    Peng, Chong
    Fang, Yuming
    Zhang, Dingwen
    Qin, Hong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3995 - 4007
  • [45] Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes
    Li, Nanjun
    Chang, Faliang
    Liu, Chunsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 203 - 215
  • [46] Action change detection in video using a bilateral spatial-temporal constraint
    Tian, Jing
    Chen, Li
    INTERNATIONAL JOURNAL OF ELECTRONICS, 2016, 103 (08) : 1279 - 1286
  • [47] Spatial-temporal algorithm for moving objects detection in infrared video sequences
    Pokrajac, D
    Zeljkovic, V
    Latecki, LJ
    TELSIKS 2005, PROCEEDINGS, VOLS 1 AND 2, 2005, : 177 - 180
  • [48] Learning Graph Enhanced Spatial-Temporal Coherence for Video Anomaly Detection
    Cheng, Kai
    Liu, Yang
    Zeng, Xinhua
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 314 - 318
  • [49] Spatial-Temporal Salient Unit Detection Based on Features in Video Sequences
    Liu, Suolan
    Yang, Wankou
    Wang, Hongyuan
    Sun, Changyin
    2013 32ND CHINESE CONTROL CONFERENCE (CCC), 2013, : 3556 - 3559
  • [50] Video Inpainting in Spatial-Temporal Domain Based on Adaptive Background and Color Variance
    Huang, Hui-Yu
    Lin, Chih-Hung
    TRENDS IN APPLIED KNOWLEDGE-BASED SYSTEMS AND DATA SCIENCE, 2016, 9799 : 633 - 644