SupeRGB-D: Zero-Shot Instance Segmentation in Cluttered Indoor Environments

Times cited: 6
Authors
Örnek, Evin Pınar [1]
Krishnan, Aravindhan K. [2]
Gayaka, Shreekant [2]
Kuo, Cheng-Hao [2]
Sen, Arnie [2]
Navab, Nassir [1]
Tombari, Federico [3, 4]
Affiliations
[1] Tech Univ Munich, D-80805 Munich, Germany
[2] Amazon Inc, Sunnyvale, CA 94089 USA
[3] Google, CH-8002 Zurich, Switzerland
[4] Tech Univ Munich, Fac Comp Sci, D-85748 Munich, Germany
Keywords
Image segmentation; Three-dimensional displays; Feature extraction; Object recognition; Training; Robots; Task analysis; RGB-D perception; Deep learning for visual perception; Object detection, segmentation and categorization
DOI
10.1109/LRA.2023.3271527
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split of the Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalizes to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. We further show competitive results on the real dataset OCID. With its lightweight design (0.4 MB memory requirement), our method is well suited to mobile and robotic applications. Adding DINO features can further increase performance at the cost of a higher memory requirement.
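The patch-merging idea in the abstract can be illustrated with a minimal sketch of greedy agglomerative merging over a patch adjacency graph. This is not the authors' implementation: `merge_score` is a hypothetical hand-written stand-in for the learned merge predictor, the single scalar feature per patch (for example, mean depth) is an assumption, and the function name `agglomerate` and its `threshold` parameter are illustrative only.

```python
def merge_score(feat_a, feat_b):
    """Hypothetical stand-in for a learned merge predictor.

    Returns a similarity in (0, 1]; the real method learns this from data.
    """
    return 1.0 / (1.0 + abs(feat_a - feat_b))


def agglomerate(features, sizes, edges, threshold=0.5):
    """Greedily merge adjacent patches while the best pair scores >= threshold.

    features: dict patch_id -> scalar feature (assumed, e.g. mean depth)
    sizes:    dict patch_id -> pixel count
    edges:    iterable of frozenset({a, b}) marking adjacent patches
    Returns a dict mapping each original patch_id to its final segment id.
    """
    features, sizes = dict(features), dict(sizes)  # work on copies
    edges = set(edges)
    label = {p: p for p in features}

    def score(edge):
        a, b = edge
        return merge_score(features[a], features[b])

    while edges:
        best = max(edges, key=score)
        if score(best) < threshold:
            break  # no remaining adjacent pair is similar enough
        a, b = sorted(best)  # keep the smaller id as the merged segment id

        # Merge b into a using a size-weighted mean of the features.
        total = sizes[a] + sizes[b]
        features[a] = (features[a] * sizes[a] + features[b] * sizes[b]) / total
        sizes[a] = total
        del features[b], sizes[b]

        # Relabel every patch that belonged to b.
        for patch, seg in label.items():
            if seg == b:
                label[patch] = a

        # Rewire edges that touched b, dropping the collapsed self-edge.
        rewired = set()
        for edge in edges:
            new_edge = frozenset(a if p == b else p for p in edge)
            if len(new_edge) == 2:
                rewired.add(new_edge)
        edges = rewired
    return label
```

For instance, four patches in a chain with features 1.0, 1.1, 3.0 and 3.05 and a threshold of 0.8 collapse into two segments, because only the first pair and the last pair are similar enough to merge.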
Pages: 3709-3716
Number of pages: 8