Zero-Shot Text-Guided Object Generation with Dream Fields

Cited by: 180
Authors
Jain, Ajay [1, 2]
Mildenhall, Ben [2 ]
Barron, Jonathan T. [2 ]
Abbeel, Pieter [1 ]
Poole, Ben [2 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Google Res, Mountain View, CA 94043 USA
DOI
10.1109/CVPR52688.2022.00094
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions. Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision. Due to the scarcity of diverse, captioned 3D data, prior methods only generate objects from a handful of categories, such as ShapeNet. Instead, we guide generation with image-text models pre-trained on large datasets of captioned images from the web. Our method optimizes a Neural Radiance Field from many camera views so that rendered images score highly with a target caption according to a pre-trained CLIP model. To improve fidelity and visual quality, we introduce simple geometric priors, including sparsity-inducing transmittance regularization, scene bounds, and new MLP architectures. In experiments, Dream Fields produce realistic, multi-view consistent object geometry and color from a variety of natural language captions.
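The objective described in the abstract can be illustrated with a minimal sketch. The abstract states only that rendered images should score highly against the caption under a pre-trained CLIP model and that a sparsity-inducing transmittance regularizer is used. The exact form below (a clipped mean-transmittance reward), the function names, and the default values of `target_transmittance` and `sparsity_weight` are assumptions made for illustration. The embeddings are plain lists standing in for CLIP outputs; no real CLIP model or renderer is called.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dream_fields_style_loss(image_emb, text_emb, ray_transmittances,
                            target_transmittance=0.5, sparsity_weight=0.5):
    """Hypothetical per-view loss (lower is better).

    image_emb / text_emb: placeholder embeddings of a rendered view and
        of the caption (assumed to come from an image-text model).
    ray_transmittances: per-ray fraction of light passing through the
        scene, each in [0, 1]; higher means more empty space.
    """
    # Reward agreement between the rendered view and the caption.
    similarity = cosine_similarity(image_emb, text_emb)

    # Reward empty space, but clip the reward at the target so the
    # optimizer is not pushed toward a completely empty scene.
    mean_t = sum(ray_transmittances) / len(ray_transmittances)
    sparsity_reward = min(target_transmittance, mean_t)

    return -similarity - sparsity_weight * sparsity_reward


# Toy usage with made-up numbers: a perfectly matching embedding pair
# and two rays with transmittance 0.2 and 0.4.
loss = dream_fields_style_loss([1.0, 0.0], [1.0, 0.0], [0.2, 0.4])
```

In a full system this scalar would be averaged over many randomly sampled camera views and minimized with respect to the radiance-field parameters; that optimization loop is omitted here.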
Pages: 857-866
Page count: 10