SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs

Cited by: 0

Authors:

Miao, Yang [1]

Engelmann, Francis [1,2]

Vysotska, Olga [1]

Tombari, Federico [2,3]

Pollefeys, Marc [1,4]

Barath, Daniel Bela [1]

Affiliations:

[1] Swiss Fed Inst Technol, Zurich, Switzerland

[2] Google, Menlo Pk, CA USA

[3] Tech Univ Munich, Munich, Germany

[4] Microsoft, Redmond, WA USA

Source:

COMPUTER VISION - ECCV 2024, PT VIII | 2025 / Vol. 15066

Keywords:

Coarse Localization; 3D Scene Graph; Multi-modality; Place Recognition

DOI:

10.1007/978-3-031-73242-3_8

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We introduce the task of localizing an input image within a multi-modal reference map represented by a collection of 3D scene graphs. These scene graphs comprise multiple modalities, including object-level point clouds, images, attributes, and relationships between objects, offering a lightweight and efficient alternative to conventional methods that rely on extensive image databases. Given these modalities, the proposed method SceneGraphLoc learns a fixed-sized embedding for each node (i.e., representing object instances) in the scene graph, enabling effective matching with the objects visible in the input query image. This strategy significantly outperforms other cross-modal methods, even without incorporating images into the map representation. With images, SceneGraphLoc achieves performance close to that of state-of-the-art techniques depending on large image databases, while requiring three orders-of-magnitude less storage and operating orders-of-magnitude faster. Code and models are available at https://scenegraphloc.github.io.

Citation

Pages: 127-150

Page count: 24
