Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Cited by: 1

Authors:

Wu, Yiqian [1]

Xu, Hao [1]

Tang, Xiangjun [1]

Chen, Xien [2]

Tang, Siyu [3]

Zhang, Zhebin [4]

Li, Chen [4]

Jin, Xiaogang [1]

Affiliations:

[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China

[2] Yale Univ, New Haven, CT USA

[3] Swiss Fed Inst Technol, Zurich, Switzerland

[4] OPPO US Res Ctr, Menlo Pk, CA USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

3D portrait generation; 3D-aware GANs; diffusion models;

DOI:

10.1145/3658162

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Existing neural rendering-based text-to-3D-portrait generation methods typically rely on a human geometry prior and diffusion models for guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a neural rendering-based framework with a novel joint geometry-appearance prior that achieves text-to-3D-portrait generation while overcoming these issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN, as a robust prior. This generator can produce 360° canonical 3D portraits, which serve as the starting point for the subsequent diffusion-based generation process. To mitigate the "grid-like" artifact caused by the high-frequency information in the feature-map-based 3D representation used by most 3D-aware GANs, we integrate a novel pyramid tri-grid 3D representation into 3DPortraitGAN. To generate 3D portraits from text, we first project a randomly generated image aligned with the given prompt into the latent space of the pre-trained 3DPortraitGAN. The resulting latent code is then used to synthesize a pyramid tri-grid. Starting from this pyramid tri-grid, we use score distillation sampling to distill the diffusion model's knowledge into it. We then use the diffusion model to refine the rendered images of the 3D portrait, and use these refined images as training data to further optimize the pyramid tri-grid, effectively eliminating unrealistic color and unnatural artifacts. Our experimental results show that Portrait3D can produce realistic, high-quality, and canonical 3D portraits that align with the prompt.
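
The abstract names score distillation sampling (SDS) but does not spell out the update. The sketch below illustrates the generic SDS step on a toy problem and is not the authors' implementation: the "renderer" is the identity (so the parameters theta stand in for the pyramid tri-grid), and toy_noise_predictor is a hypothetical stand-in for a text-conditioned diffusion model that always believes the clean image equals a fixed target. The weighting w = 1 - alpha_bar, the timestep range, and the learning rate are likewise assumptions chosen for illustration.

```python
import math
import random


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a diffusion model's noise prediction.

    It assumes the clean image is always `target` and returns the noise
    that would explain the noisy input x_t under that assumption.
    """
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - s * tg) / n for xt, tg in zip(x_t, target)]


def sds_step(theta, target, rng, lr=0.05):
    """One SDS update: theta <- theta - lr * w * (eps_pred - eps)."""
    alpha_bar = rng.uniform(0.2, 0.98)           # random noise level (timestep)
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    eps = [rng.gauss(0.0, 1.0) for _ in theta]   # sampled Gaussian noise
    x = theta                                    # identity renderer: dx/dtheta = I
    x_t = [s * xi + n * ei for xi, ei in zip(x, eps)]
    eps_pred = toy_noise_predictor(x_t, alpha_bar, target)
    w = 1.0 - alpha_bar                          # assumed weighting w(t)
    return [th - lr * w * (ep - e) for th, ep, e in zip(theta, eps_pred, eps)]


def run_sds(steps=500, seed=0):
    """Optimize a 16-value toy 'image' and return (initial, final) mean error."""
    rng = random.Random(seed)
    target = [0.5] * 16
    theta = [rng.gauss(0.0, 1.0) for _ in range(16)]

    def mean_abs_error(values):
        return sum(abs(a - b) for a, b in zip(values, target)) / len(values)

    start = mean_abs_error(theta)
    for _ in range(steps):
        theta = sds_step(theta, target, rng)
    return start, mean_abs_error(theta)


if __name__ == "__main__":
    start_err, end_err = run_sds()
    print(f"mean abs error: {start_err:.4f} -> {end_err:.6f}")
```

In this toy setting the predicted-minus-sampled noise reduces to a positive multiple of (theta - target), so repeated steps pull the parameters toward what the stand-in model considers plausible. A real pipeline would replace the identity renderer with differentiable rendering of the 3D representation and the stand-in with a pretrained diffusion model.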


Pages: 12
