Zero-Shot Text-Guided Object Generation with Dream Fields

Cited by: 180

Authors:

Jain, Ajay [1,2]

Mildenhall, Ben [2]

Barron, Jonathan T. [2]

Abbeel, Pieter [1]

Poole, Ben [2]

Affiliations:

[1] University of California, Berkeley, Berkeley, CA 94720 USA

[2] Google Research, Mountain View, CA 94043 USA

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

Keywords:

DOI:

10.1109/CVPR52688.2022.00094

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions. Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision. Due to the scarcity of diverse, captioned 3D data, prior methods only generate objects from a handful of categories, such as ShapeNet. Instead, we guide generation with image-text models pre-trained on large datasets of captioned images from the web. Our method optimizes a Neural Radiance Field from many camera views so that rendered images score highly with a target caption according to a pre-trained CLIP model. To improve fidelity and visual quality, we introduce simple geometric priors, including sparsity-inducing transmittance regularization, scene bounds, and new MLP architectures. In experiments, Dream Fields produce realistic, multi-view consistent object geometry and color from a variety of natural language captions.
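The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings below are plain lists standing in for CLIP image and text features, the per-ray densities stand in for NeRF samples, and the clipped form of the transmittance term, along with the hypothetical parameters `tau`, `lam`, and `step`, is an assumption made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_transmittance(densities, step):
    """Average over rays of exp(-sum(sigma * step)), the fraction of light
    that passes through each ray without being absorbed."""
    total = 0.0
    for ray in densities:
        total += math.exp(-sum(sigma * step for sigma in ray))
    return total / len(densities)


def dream_fields_style_loss(image_emb, text_emb, densities,
                            step=0.1, tau=0.5, lam=0.5):
    """Illustrative loss: reward image-text agreement and a mostly empty scene.

    tau (target transmittance) and lam (regularizer weight) are hypothetical
    values chosen for this sketch, not settings taken from the paper.
    """
    # Higher similarity between the rendered view and the caption lowers the loss.
    clip_term = -cosine_similarity(image_emb, text_emb)
    # Higher average transmittance (a sparser scene) lowers the loss, up to tau.
    sparsity_term = -min(tau, mean_transmittance(densities, step))
    return clip_term + lam * sparsity_term


# An empty scene whose render matches the caption, versus a dense one.
sparse_loss = dream_fields_style_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 0.0]])
dense_loss = dream_fields_style_loss([1.0, 0.0], [1.0, 0.0], [[100.0, 100.0]])
```

In a real pipeline the embeddings would come from a pre-trained image-text model and the densities from a differentiable renderer, with the loss minimized by gradient descent over many sampled camera views; this sketch only shows how the two terms combine.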

Pages: 857-866

Page count: 10

