Indoor and Outdoor 3D Scene Graph Generation Via Language-Enabled Spatial Ontologies

Cited by: 1

Authors:

Strader, Jared [1]

Hughes, Nathan [1]

Chen, William [2]

Speranzon, Alberto [3]

Carlone, Luca [1]

Affiliations:

[1] MIT, Lab Informat & Decis Syst LIDS, Cambridge, MA 02139 USA

[2] Univ Calif Berkeley, Berkeley Artificial Intelligence Res BAIR, Berkeley, CA 94720 USA

[3] Lockheed Martin, Adv Technol Labs, Eagan, MN 55121 USA

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2024 / Vol. 9 / No. 06

Keywords:

AI-based methods; 3D scene graphs; semantic scene understanding; spatial ontologies;

DOI:

10.1109/LRA.2024.3384084

Chinese Library Classification (CLC):

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

This letter proposes an approach to build 3D scene graphs in arbitrary indoor and outdoor environments. Such extension is challenging; the hierarchy of concepts that describe an outdoor environment is more complex than for indoors, and manually defining such hierarchy is time-consuming and does not scale. Furthermore, the lack of training data prevents the straightforward application of learning-based tools used in indoor settings. To address these challenges, we propose two novel extensions. First, we develop methods to build a spatial ontology defining concepts and relations relevant for indoor and outdoor robot operation. In particular, we use a Large Language Model (LLM) to build such an ontology, thus largely reducing the amount of manual effort required. Second, we leverage the spatial ontology for 3D scene graph construction using Logic Tensor Networks (LTN) to add logical rules, or axioms (e.g., "a beach contains sand"), which provide additional supervisory signals at training time thus reducing the need for labelled data, providing better predictions, and even allowing predicting concepts unseen at training time. We test our approach in a variety of datasets, including indoor, rural, and coastal environments, and show that it leads to a significant increase in the quality of the 3D scene graph generation with sparsely annotated data.
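The idea of using an axiom such as "a beach contains sand" as an extra supervisory signal can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the Reichenbach fuzzy implication, and the mean as a soft "for all" are assumptions made for illustration only. The abstract does not state which fuzzy operators the Logic Tensor Network uses.

```python
from typing import Sequence


def implies(a: float, b: float) -> float:
    """Reichenbach fuzzy implication (an assumed choice): 1 - a + a*b on [0, 1]."""
    return 1.0 - a + a * b


def axiom_satisfaction(p_beach: Sequence[float], p_sand: Sequence[float]) -> float:
    """Mean truth of 'beach(x) -> contains_sand(x)' over all places x.

    p_beach[i] is the predicted probability that place i is a beach;
    p_sand[i] is the predicted probability that place i contains sand.
    The mean acts as a soft universal quantifier (hypothetical aggregator).
    """
    if len(p_beach) != len(p_sand) or len(p_beach) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    truths = [implies(a, b) for a, b in zip(p_beach, p_sand)]
    return sum(truths) / len(truths)


def axiom_loss(p_beach: Sequence[float], p_sand: Sequence[float]) -> float:
    """Loss term that is zero when the axiom is fully satisfied."""
    return 1.0 - axiom_satisfaction(p_beach, p_sand)


if __name__ == "__main__":
    # Place 0 is predicted as a beach without sand (violates the axiom);
    # place 1 is not a beach, so the implication holds trivially.
    print(axiom_loss([1.0, 0.0], [0.0, 0.0]))
```

A term like this needs no ground-truth label for the place itself, which is one way to read the abstract's claim that axioms reduce the need for labelled data.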

Pages: 4886-4893

Page count: 8
