Indoor and Outdoor 3D Scene Graph Generation Via Language-Enabled Spatial Ontologies

Cited by: 1
Authors
Strader, Jared [1]
Hughes, Nathan [1]
Chen, William [2]
Speranzon, Alberto [3]
Carlone, Luca [1]
Affiliations
[1] MIT, Lab Informat & Decis Syst LIDS, Cambridge, MA 02139 USA
[2] Univ Calif Berkeley, Berkeley Artificial Intelligence Res BAIR, Berkeley, CA 94720 USA
[3] Lockheed Martin, Adv Technol Labs, Eagan, MN 55121 USA
Source
Keywords
AI-based methods; 3D scene graphs; semantic scene understanding; spatial ontologies
DOI
10.1109/LRA.2024.3384084
Chinese Library Classification number
TP24 [Robotics]
Subject classification code
080202; 1405
Abstract
This letter proposes an approach to build 3D scene graphs in arbitrary indoor and outdoor environments. Such an extension is challenging: the hierarchy of concepts that describe an outdoor environment is more complex than for indoor environments, and manually defining such a hierarchy is time-consuming and does not scale. Furthermore, the lack of training data prevents the straightforward application of learning-based tools used in indoor settings. To address these challenges, we propose two novel extensions. First, we develop methods to build a spatial ontology defining concepts and relations relevant for indoor and outdoor robot operation. In particular, we use a Large Language Model (LLM) to build such an ontology, thus largely reducing the amount of manual effort required. Second, we leverage the spatial ontology for 3D scene graph construction using Logic Tensor Networks (LTN) to add logical rules, or axioms (e.g., "a beach contains sand"), which provide additional supervisory signals at training time, thus reducing the need for labelled data, providing better predictions, and even allowing the prediction of concepts unseen at training time. We test our approach on a variety of datasets, including indoor, rural, and coastal environments, and show that it leads to a significant increase in the quality of 3D scene graph generation with sparsely annotated data.
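As an illustration of how an axiom such as "a beach contains sand" can act as a supervisory signal, the following minimal sketch scores the rule with fuzzy logic and turns its satisfaction into a loss term. It is not the authors' implementation: the function names, the choice of the Reichenbach implication, and the p-mean-error aggregator are assumptions taken from common practice in the Logic Tensor Network literature, and the input scores stand in for hypothetical network outputs in [0, 1].

```python
def implies(a, b):
    """Reichenbach fuzzy implication: truth of 'a -> b' for a, b in [0, 1]."""
    return 1.0 - a + a * b


def forall(truths, p=2):
    """Smooth 'for all' aggregator (p-mean of errors); larger p is stricter."""
    n = len(truths)
    return 1.0 - (sum((1.0 - t) ** p for t in truths) / n) ** (1.0 / p)


def axiom_loss(beach_scores, sand_scores, p=2):
    """Loss for the axiom 'forall x: Beach(x) -> ContainsSand(x)'.

    beach_scores[i] and sand_scores[i] are hypothetical predicate outputs
    for the same scene node i. The loss is 0 when the axiom is fully
    satisfied and grows as nodes violate it.
    """
    per_node = [implies(a, b) for a, b in zip(beach_scores, sand_scores)]
    return 1.0 - forall(per_node, p)


if __name__ == "__main__":
    # A node confidently labelled 'beach' but with a low 'contains sand'
    # score is penalized even though it has no ground-truth label.
    print(axiom_loss([0.9, 0.1], [0.2, 0.1]))
```

Because the loss depends only on the predicted truth values, it can be evaluated on unlabelled nodes, which is the sense in which axioms reduce the need for annotated data.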
Pages: 4886-4893
Page count: 8