Granular3D: Delving into multi-granularity 3D scene graph prediction

Authors
Huang, Kaixiang [1, 2]
Yang, Jingru [1, 2]
Wang, Jin [1, 2, 6]
He, Shengfeng [3]
Wang, Zhan [4]
He, Haiyan [1, 2, 5]
Zhang, Qifeng
Lu, Guodong [1, 2]
Affiliations
[1] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Robot Inst, Hangzhou 310027, Zhejiang, Peoples R China
[3] Singapore Management Univ, Singapore 178903, Singapore
[4] Zhejiang Energy Digital Technol Co Ltd, Dept Artificial Intelligence & Robot, Hangzhou 310027, Zhejiang, Peoples R China
[5] Zhejiang Baima Lake Lab Co Ltd, Hangzhou 310000, Zhejiang, Peoples R China
[6] Jinhua Key Lab Robot Intelligent Welding Technol, Jinhua 321000, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D point cloud; 3D semantic scene graph prediction; Multi-granularity; Gather point transformer; LANGUAGE;
DOI
10.1016/j.patcog.2024.110562
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the significant challenges in 3D Semantic Scene Graph (3DSSG) prediction, which is essential for understanding complex 3D environments. Traditional approaches, primarily based on PointNet and Graph Convolutional Networks, struggle to extract multi-grained features from intricate 3D scenes, largely because they focus on global scene processing and single-scale feature extraction. To overcome these limitations, we introduce Granular3D, a novel approach that shifts the focus towards multi-granularity analysis by predicting relation triplets from specific sub-scenes. One key component is the Adaptive Instance Enveloping Method (AIEM), which establishes an approximate envelope structure around irregular instances, providing shape-adaptive local point cloud sampling and thereby comprehensively covering the contextual environments of instances. Moreover, Granular3D incorporates a Hierarchical Dual-Stage Network (HDSN), which differentiates and processes features of instances and their pairs at varying scales, leading to a targeted prediction of instance categories and their relationships. To advance the perception of sub-scenes in HDSN, we design a Gather Point Transformer structure (GaPT) that enables the combinatorial interaction of local information from multiple point cloud sets, achieving a more comprehensive local contextual feature extraction. Extensive evaluations on the challenging 3DSSG benchmark demonstrate that our methods provide substantial improvements, establishing a new state of the art in 3DSSG prediction and boosting the top-50 triplet accuracy by 2.8%.
Pages: 12