Spatial constraint for efficient semi-supervised video object segmentation

Cited by: 1

Authors:

Chen, Yadang [1, 2]

Ji, Chuanjun [1, 2]

Yang, Zhi-Xin [3, 4]

Wu, Enhua [5]

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Engn Res Ctr Digital Forens, Minist Educ, Nanjing 210044, Peoples R China

[2] Nanjing Univ Informat Sci & Technol, Sch Comp & Software, Nanjing 210044, Peoples R China

[3] Univ Macau, State Key Lab Internet Things Smart City, Macau 999078, Peoples R China

[4] Univ Macau, Dept Electromech Engn, Macau 999078, Peoples R China

[5] Univ Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China

Source:

COMPUTER VISION AND IMAGE UNDERSTANDING | 2023 / Vol. 237

Funding:

National Natural Science Foundation of China;

Keywords:

Video object segmentation; Memory-based methods; Redundant information; Semantically similar objects;

DOI:

10.1016/j.cviu.2023.103843

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semi-supervised video object segmentation is the process of tracking and segmenting objects in a video sequence based on annotated masks for one or more frames. Recently, memory-based methods have attracted a significant amount of attention due to their strong performance. Having too much redundant information stored in memory, however, makes such methods inefficient and inaccurate. Moreover, a global matching strategy is usually used for memory reading, so these methods are susceptible to interference from semantically similar objects and are prone to incorrect segmentation. We propose a spatial constraint network to overcome these problems. In particular, we introduce a time-varying sensor and a dynamic feature memory to adaptively store pixel information to facilitate the modeling of the target object, which greatly reduces information redundancy in the memory without missing critical information. Furthermore, we propose an efficient memory reader that is less computationally intensive and has a smaller footprint. More importantly, we introduce a spatial constraint module to learn spatial consistency to obtain more precise segmentation; the target and distractors can be identified by the learned spatial response. The experimental results indicate that our method is competitive with state-of-the-art methods on several benchmark datasets. Our method also achieves an approximately 30 FPS inference speed, which is close to the requirement for real-time systems.
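The global matching that the abstract attributes to memory-based methods can be illustrated with a minimal sketch. It shows a generic attention-style memory readout, not the paper's efficient memory reader or its spatial constraint module; the function name `global_memory_read`, the tensor shapes, and the scaling factor are assumptions made for illustration only.

```python
import numpy as np

def global_memory_read(query_key, memory_key, memory_value):
    """Generic global-matching memory read (illustrative, hypothetical helper).

    query_key:    (C_k, N_q) key features of the current frame, flattened over space
    memory_key:   (C_k, N_m) key features of all pixels stored in memory
    memory_value: (C_v, N_m) value features of all pixels stored in memory
    returns:      (C_v, N_q) one readout vector per query pixel
    """
    # Similarity of every memory pixel to every query pixel: shape (N_m, N_q).
    affinity = memory_key.T @ query_key / np.sqrt(query_key.shape[0])
    # Softmax over the memory axis (shifted by the max for numerical stability).
    affinity -= affinity.max(axis=0, keepdims=True)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=0, keepdims=True)
    # Each query pixel receives a weighted sum of all memory values.
    return memory_value @ weights
```

Because each query pixel is compared with every memory pixel regardless of position, the affinity matrix grows with the number of stored pixels, and a semantically similar object elsewhere in the frame can receive a high weight. These are the two issues the abstract raises about redundant memory and global matching.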


Pages: 10
