Granular3D: Delving into multi-granularity 3D scene graph prediction

Cited by: 0
Authors
Huang, Kaixiang [1 ,2 ]
Yang, Jingru [1 ,2 ]
Wang, Jin [1 ,2 ,6 ]
He, Shengfeng [3 ]
Wang, Zhan [4 ]
He, Haiyan [1 ,2 ,5 ]
Zhang, Qifeng
Lu, Guodong [1 ,2 ]
Affiliations
[1] Zhejiang Univ, State Key Lab Fluid Power & Mechatron Syst, Hangzhou 310027, Zhejiang, Peoples R China
[2] Zhejiang Univ, Robot Inst, Hangzhou 310027, Zhejiang, Peoples R China
[3] Singapore Management Univ, Singapore 178903, Singapore
[4] Zhejiang Energy Digital Technol Co Ltd, Dept Artificial Intelligence & Robot, Hangzhou 310027, Zhejiang, Peoples R China
[5] Zhejiang Baima Lake Lab Co Ltd, Hangzhou 310000, Zhejiang, Peoples R China
[6] Jinhua Key Lab Robot Intelligent Welding Technol, Jinhua 321000, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D point cloud; 3D semantic scene graph prediction; Multi-granularity; Gather point transformer; LANGUAGE;
DOI
10.1016/j.patcog.2024.110562
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper addresses the significant challenges in 3D Semantic Scene Graph (3DSSG) prediction, which is essential for understanding complex 3D environments. Traditional approaches, primarily based on PointNet and Graph Convolutional Networks, struggle to extract multi-grained features from intricate 3D scenes, largely because they focus on global scene processing and single-scale feature extraction. To overcome these limitations, we introduce Granular3D, a novel approach that shifts the focus towards multi-granularity analysis by predicting relation triplets from specific sub-scenes. One key component is the Adaptive Instance Enveloping Method (AIEM), which establishes an approximate envelope structure around irregular instances, providing shape-adaptive local point cloud sampling and thereby comprehensively covering the contextual environments of instances. Moreover, Granular3D incorporates a Hierarchical Dual-Stage Network (HDSN), which differentiates and processes features of instances and their pairs at varying scales, leading to a targeted prediction of instance categories and their relationships. To advance the perception of sub-scenes in HDSN, we design a Gather Point Transformer structure (GaPT) that enables the combinatorial interaction of local information from multiple point cloud sets, achieving a more comprehensive local contextual feature extraction. Extensive evaluations on the challenging 3DSSG benchmark demonstrate that our methods provide substantial improvements, establishing a new state of the art in 3DSSG prediction and boosting the top-50 triplet accuracy by +2.8%.
Pages: 12