ShapeScaffolder: Structure-Aware 3D Shape Generation from Text

Cited by: 3
Authors
Tian, Xi [1]
Yang, Yong-Liang [1]
Wu, Qi [2]
Affiliations
[1] Univ Bath, Bath, Avon, England
[2] Univ Adelaide, Adelaide, SA, Australia
DOI
10.1109/ICCV51070.2023.00256
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present ShapeScaffolder, a structure-based neural network for generating colored 3D shapes from text input. The approach, analogous to erecting scaffolds as internal structural supports and then adding detail to them, aims to capture finer text-shape connections and improve the quality of generated shapes. Traditional text-to-shape methods often generate 3D shapes as a whole. However, humans tend to understand both shape and text as being structure-based. For example, a chair is interpreted as being composed of legs, a seat, and a back; similarly, texts possess inherent linguistic structures that can be analyzed as dependency graphs, depicting the relationships between entities within the text. We believe structure-aware shape generation can bring finer text-shape connections and improve shape generation quality. However, the lack of explicit shape structure and the high degree of freedom in text structure make cross-modality learning challenging. To address these challenges, we first build structured shape implicit fields in an unsupervised manner. We then propose a part-level attention mechanism between shape parts and textual graph nodes to align the two modalities at the structural level. Finally, we employ a shape refiner to add further detail to the predicted structure, yielding the final results. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in terms of both shape fidelity and shape-text matching. Our method also allows for part-level manipulation and improves part-level completeness.
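The abstract gives no implementation details for the part-level attention, so the following is only a minimal illustrative sketch of the general idea: each shape part attends over the nodes of the text graph with scaled dot-product attention. The function names, the toy feature vectors, and the omission of learned projections are all assumptions made for illustration; this is not the authors' model.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def part_level_attention(part_feats, node_feats):
    """Hypothetical helper: each shape part queries the text-graph nodes.

    part_feats: list of P part feature vectors (each of length d).
    node_feats: list of N text-node feature vectors (each of length d).
    Returns (attended, weights): P vectors of length d, and a P x N
    matrix of attention weights whose rows sum to 1.
    Learned query/key/value projections are omitted (an assumption).
    """
    d = len(part_feats[0])
    attended, weights = [], []
    for p in part_feats:
        # Scaled dot-product score between this part and every text node.
        scores = [sum(a * b for a, b in zip(p, n)) / math.sqrt(d)
                  for n in node_feats]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of node features gives the text context for the part.
        attended.append([sum(w_j * n[i] for w_j, n in zip(w, node_feats))
                         for i in range(d)])
    return attended, weights


# Toy inputs: two parts and three text nodes in a 2-D feature space.
parts = [[1.0, 0.0], [0.0, 1.0]]
nodes = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
attended, weights = part_level_attention(parts, nodes)
```

In this toy setup the first part places its largest weight on the first node and the second part on the second node, which is the kind of part-to-node alignment the abstract describes at a high level.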
Pages: 2715-2724
Number of pages: 10