Temporal Transductive Inference for Few-Shot Video Object Segmentation

Cited by: 0
Authors
Siam, Mennatullah [1 ]
Affiliations
[1] Univ British Columbia, Comp Sci, Vancouver, BC, Canada
Keywords
Few-shot learning; Transductive inference; Video object segmentation;
DOI
10.1007/s11263-025-02390-x
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Few-shot video object segmentation (FS-VOS) aims at segmenting video frames using a few labelled examples of classes not seen during initial training. In this paper, we present a simple but effective temporal transductive inference (TTI) approach that leverages temporal consistency in the unlabelled video frames during few-shot inference, without episodic training. Key to our approach is a video-level temporal constraint that augments the frame-level constraints. The objective of the video-level constraint is to learn consistent linear classifiers for novel classes across the image sequence. It acts as a spatiotemporal regularizer during transductive inference, increasing temporal coherence and reducing overfitting on the few-shot support set. Empirically, our approach outperforms state-of-the-art meta-learning approaches by 2.5% in mean intersection over union on YouTube-VIS. In addition, we introduce an improved benchmark dataset that is exhaustively labelled (i.e., all object occurrences are labelled, unlike the currently available benchmarks). Our empirical results and temporal consistency analysis confirm the benefit of the proposed spatiotemporal regularizer in improving temporal coherence. Our code and benchmark dataset are publicly available at https://github.com/MSiam/tti_fsvos/.
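The abstract describes transductive inference that fits a linear classifier for the novel class using a supervised loss on the labelled support pixels, plus a video-level regularizer encouraging consistent predictions across frames. As a rough illustration only (the function names, the squared-deviation form of the regularizer, and all shapes below are assumptions, not the paper's actual objective), one such combined loss might be sketched as:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def tti_loss(W, sup_feats, sup_labels, query_feats):
    """Hypothetical TTI-style objective for a linear pixel classifier W (D, C):
    cross-entropy on labelled support pixels plus a video-level term that
    penalizes per-frame prediction statistics drifting from the sequence mean."""
    # Frame-level (supervised) term: cross-entropy on support pixels (Ns, D).
    p_sup = softmax(sup_feats @ W)                              # (Ns, C)
    ce = -np.log(p_sup[np.arange(len(sup_labels)), sup_labels] + 1e-8).mean()

    # Video-level term: mean class proportions per frame of the unlabelled
    # query video (T frames, Np pixels each) should agree across frames —
    # a simple spatiotemporal-consistency proxy.
    p_q = softmax(query_feats @ W)                              # (T, Np, C)
    frame_props = p_q.mean(axis=1)                              # (T, C)
    video_prop = frame_props.mean(axis=0, keepdims=True)        # (1, C)
    temporal_reg = ((frame_props - video_prop) ** 2).sum(axis=1).mean()

    return ce + temporal_reg
```

In this sketch, minimizing `tti_loss` over `W` (e.g. by gradient descent) would trade off fitting the support set against keeping predictions temporally coherent over the unlabelled frames, which mirrors the role the paper assigns to its spatiotemporal regularizer.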
Pages: 18