SupeRGB-D: Zero-Shot Instance Segmentation in Cluttered Indoor Environments

Cited by: 6
Authors
Oernek, Evin Pınar [1]
Krishnan, Aravindhan K. [2]
Gayaka, Shreekant [2]
Kuo, Cheng-Hao [2]
Sen, Arnie [2]
Navab, Nassir [1]
Tombari, Federico [3, 4]
Affiliations
[1] Tech Univ Munich, D-80805 Munich, Germany
[2] Amazon Inc, Sunnyvale, CA 94089 USA
[3] Google, CH-8002 Zurich, Switzerland
[4] Tech Univ Munich, Fac Comp Sci, D-85748 Munich, Germany
Keywords
Image segmentation; Three-dimensional displays; Feature extraction; Object recognition; Training; Robots; Task analysis; RGB-D perception; Deep learning for visual perception; Object detection; Segmentation and categorization
DOI
10.1109/LRA.2023.3271527
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Object instance segmentation is a key challenge for indoor robots navigating cluttered environments with many small objects. Limitations in 3D sensing capabilities often make it difficult to detect every possible object. While deep learning approaches may be effective for this problem, manually annotating 3D data for supervised learning is time-consuming. In this work, we explore zero-shot instance segmentation (ZSIS) from RGB-D data to identify unseen objects in a semantic category-agnostic manner. We introduce a zero-shot split of the Tabletop Objects Dataset (TOD-Z) to enable this study and present a method that uses annotated objects to learn the "objectness" of pixels and generalize to unseen object categories in cluttered indoor environments. Our method, SupeRGB-D, groups pixels into small patches based on geometric cues and learns to merge the patches in a deep agglomerative clustering fashion. SupeRGB-D outperforms existing baselines on unseen objects while achieving similar performance on seen objects. We further show competitive results on the real dataset OCID. With its lightweight design (0.4 MB memory requirement), our method is well suited to mobile and robotic applications. Additional DINO features can increase the performance at the cost of a higher memory requirement.
Pages: 3709-3716
Page count: 8