ShapeScaffolder: Structure-Aware 3D Shape Generation from Text

Cited by: 3

Authors:

Tian, Xi [1]

Yang, Yong-Liang [1]

Wu, Qi [2]

Affiliations:

[1] Univ Bath, Bath, Avon, England

[2] Univ Adelaide, Adelaide, SA, Australia

Source:

2023 IEEE/CVF International Conference on Computer Vision (ICCV) | 2023

DOI:

10.1109/ICCV51070.2023.00256

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present ShapeScaffolder, a structure-based neural network for generating colored 3D shapes from text input. The approach, similar to providing scaffolds as internal structural supports and then adding detail to them, aims to capture finer text-shape connections and improve the quality of generated shapes. Traditional text-to-shape methods often generate 3D shapes as a whole. However, humans tend to understand both shape and text as structure-based. For example, a chair is interpreted as being composed of legs, a seat, and a back; similarly, texts possess inherent linguistic structures that can be analyzed as dependency graphs, which depict the relationships between entities within the text. However, the lack of explicit shape structure and the high degree of freedom in text structure make cross-modality learning challenging. To address these challenges, we first build structured shape implicit fields in an unsupervised manner. We then propose a part-level attention mechanism between shape parts and textual graph nodes to align the two modalities at the structural level. Finally, we employ a shape refiner to add further detail to the predicted structure, yielding the final results. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in terms of both shape fidelity and shape-text matching. Our method also allows for part-level manipulation and improved part-level completeness.
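
As a rough illustration of what "part-level attention between shape parts and textual graph nodes" could look like, the following sketch applies generic scaled dot-product cross-attention, with each shape part acting as a query over the text graph nodes. This is not the authors' implementation: the function name `part_level_attention`, the feature vectors, and the choice of scaled dot-product scoring are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def part_level_attention(part_feats, node_feats):
    """Hypothetical cross-attention: each shape part (query) attends over
    text graph nodes (used here as both keys and values).

    Returns (weights, attended): one weight row per part, and one
    text-conditioned feature vector per part.
    """
    dim = len(node_feats[0])
    scale = 1.0 / math.sqrt(dim)
    weights, attended = [], []
    for part in part_feats:
        # Scaled dot-product score between this part and every text node.
        scores = [scale * sum(a * b for a, b in zip(part, node))
                  for node in node_feats]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of node features gives the part's text context.
        attended.append([
            sum(w[j] * node_feats[j][k] for j in range(len(node_feats)))
            for k in range(dim)
        ])
    return weights, attended


# Made-up 2D features: two shape parts and three dependency-graph nodes.
parts = [[1.0, 0.0], [0.0, 1.0]]
nodes = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, attended = part_level_attention(parts, nodes)
```

Each row of `weights` indicates how strongly one part is associated with each text node, which is the kind of structural alignment the abstract describes; a real system would use learned projections for queries, keys, and values.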

Pages: 2715-2724

Page count: 10
