SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs

Authors
Miao, Yang [1]
Engelmann, Francis [1, 2]
Vysotska, Olga [1]
Tombari, Federico [2, 3]
Pollefeys, Marc [1, 4]
Barath, Daniel Bela [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Google, Menlo Pk, CA USA
[3] Tech Univ Munich, Munich, Germany
[4] Microsoft, Redmond, WA USA
Keywords
Coarse Localization; 3D Scene Graph; Multi-modality; Place Recognition
DOI
10.1007/978-3-031-73242-3_8
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce the task of localizing an input image within a multi-modal reference map represented by a collection of 3D scene graphs. These scene graphs comprise multiple modalities, including object-level point clouds, images, attributes, and relationships between objects, offering a lightweight and efficient alternative to conventional methods that rely on extensive image databases. Given these modalities, the proposed method, SceneGraphLoc, learns a fixed-size embedding for each node (i.e., each object instance) in the scene graph, enabling effective matching with the objects visible in the input query image. This strategy significantly outperforms other cross-modal methods, even without incorporating images into the map representation. With images, SceneGraphLoc achieves performance close to that of state-of-the-art techniques that depend on large image databases, while requiring three orders of magnitude less storage and running orders of magnitude faster. Code and models are available at https://scenegraphloc.github.io.
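The matching idea in the abstract (comparing fixed-size node embeddings with the objects visible in a query image) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine`, `scene_score`, and `rank_scenes`, the use of cosine similarity, the mean-of-best-matches scoring rule, and the toy 3-dimensional embeddings are all assumptions made for illustration. In the actual method the embeddings are learned from the scene graph modalities.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def scene_score(query_embs, node_embs):
    """Hypothetical scoring rule: for each query object embedding, take its
    best-matching scene-graph node, then average those best similarities."""
    if not query_embs or not node_embs:
        return 0.0
    best = [max(cosine(q, n) for n in node_embs) for q in query_embs]
    return sum(best) / len(best)

def rank_scenes(query_embs, scenes):
    """Rank candidate scene graphs (scene id -> list of node embeddings)
    from most to least similar to the query image's object embeddings."""
    scores = {sid: scene_score(query_embs, nodes) for sid, nodes in scenes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (assumed): two scenes with two object nodes each.
scenes = {
    "kitchen": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "office": [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
}
# Toy embeddings for two objects detected in the query image.
query = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]

ranking = rank_scenes(query, scenes)
print(ranking[0][0])  # id of the best-matching scene
```

Because only one small embedding per object is stored and compared, a map of this kind can be far lighter than a database of reference images, which is the storage advantage the abstract describes.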
Pages: 127-150
Page count: 24