Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

Cited by: 0

Authors:

Cheng, Ho Kei [1]

Tai, Yu-Wing [2]

Tang, Chi-Keung [3]

Affiliations:

[1] Univ Illinois, Urbana, IL 61801 USA

[2] Kuaishou Technol, Beijing, Peoples R China

[3] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a simple yet effective approach to modeling space-time correspondences in the context of video object segmentation. Unlike most existing approaches, we establish correspondences directly between frames without re-encoding the mask features for every object, leading to a highly efficient and robust framework. With the correspondences, every node in the current query frame is inferred by aggregating features from the past in an associative fashion. We cast the aggregation process as a voting problem and find that the existing inner-product affinity leads to poor use of memory, with a small (fixed) subset of memory nodes dominating the votes regardless of the query. In light of this phenomenon, we propose using the negative squared Euclidean distance instead to compute the affinities. We validate that every memory node now has a chance to contribute, and experimentally show that such diversified voting is beneficial to both memory efficiency and inference accuracy. The synergy of correspondence networks and diversified voting works exceedingly well, achieving new state-of-the-art results on both the DAVIS and YouTubeVOS datasets while running significantly faster, at 20+ FPS for multiple objects, without bells and whistles.
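The voting behavior described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names, the softmax normalization of the votes, and the hand-picked two-dimensional keys are all assumptions made for illustration. It compares inner-product affinity with negative squared Euclidean distance on a memory that contains one large-norm key.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_affinity(keys, query):
    """Inner-product affinity between each memory key and the query."""
    return [sum(k_i * q_i for k_i, q_i in zip(k, query)) for k in keys]

def neg_sq_l2_affinity(keys, query):
    """Negative squared Euclidean distance between each key and the query."""
    return [-sum((k_i - q_i) ** 2 for k_i, q_i in zip(k, query)) for k in keys]

def readout(keys, values, query, affinity):
    """Aggregate memory values with normalized votes (softmax is an assumption)."""
    weights = softmax(affinity(keys, query))
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, out

def top_voter(weights):
    """Index of the memory node that receives the largest vote."""
    return max(range(len(weights)), key=weights.__getitem__)

# Hypothetical toy memory: key 0 has a much larger norm than keys 1 and 2.
memory_keys = [[10.0, 10.0], [1.0, 0.0], [0.0, 1.0]]
memory_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]

dot_winners = [
    top_voter(readout(memory_keys, memory_values, q, dot_affinity)[0])
    for q in queries
]
l2_winners = [
    top_voter(readout(memory_keys, memory_values, q, neg_sq_l2_affinity)[0])
    for q in queries
]

print(dot_winners)  # [0, 0]: the large-norm key wins for both queries
print(l2_winners)   # [1, 2]: each query is matched to its nearest key
```

In this toy setting the large-norm key takes nearly all of the inner-product votes for both queries, while the distance-based affinity lets the nearest key contribute, which is the diversified-voting effect the abstract describes.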


Pages: 14
