Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation

Cited by: 96
Authors
Wang, Haochen [1]
Du, Xiaodan [1]
Li, Jiahao [1]
Yeh, Raymond A. [2]
Shakhnarovich, Greg [1]
Affiliations
[1] TTI Chicago, Chicago, IL 60637 USA
[2] Purdue Univ, W Lafayette, IN 47907 USA
DOI
10.1109/CVPR52729.2023.01214
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A diffusion model learns to predict a vector field of gradients. We propose to apply the chain rule to the learned gradients and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field. This setup aggregates 2D scores at multiple camera viewpoints into a 3D score and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION-5B dataset.
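The chain-rule idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a fixed linear map per camera stands in for the differentiable renderer (so its Jacobian is the map itself), an analytic Gaussian score stands in for the pretrained diffusion model, and the names `render`, `score_2d`, `score_3d`, and `VIEWS` are hypothetical. The sketch does not address the distribution mismatch mentioned in the abstract.

```python
# Toy illustration only: a linear "renderer" and an analytic Gaussian score
# stand in for the voxel radiance field and the pretrained diffusion model.

def render(view, theta):
    """Hypothetical linear renderer: x = J @ theta, where `view` is the 2x3 Jacobian J."""
    return [sum(j * t for j, t in zip(row, theta)) for row in view]

def score_2d(x):
    """Stand-in 2D score: gradient of log N(x; 0, I) with respect to x, i.e. -x."""
    return [-xi for xi in x]

def score_3d(views, theta):
    """Average J^T score_2d(render(view, theta)) over camera views (chain rule)."""
    total = [0.0] * len(theta)
    for view in views:
        s = score_2d(render(view, theta))
        # Vector-Jacobian product: (J^T s)_k = sum_i J[i][k] * s[i]
        for k in range(len(theta)):
            total[k] += sum(view[i][k] * s[i] for i in range(len(s)))
    return [t / len(views) for t in total]

# Three axis-aligned orthographic "cameras", each dropping one coordinate.
VIEWS = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # looks along z
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # looks along y
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # looks along x
]

theta = [3.0, -1.5, 0.6]
for _ in range(200):
    g = score_3d(VIEWS, theta)
    # Gradient ascent on the (toy) 3D log-density using the aggregated score.
    theta = [t + 0.1 * gi for t, gi in zip(theta, g)]
```

In this toy setting each 3D coordinate is visible in two of the three views, so the aggregated 3D score pulls every coordinate toward the mode of the stand-in 2D density, and `theta` shrinks toward zero over the ascent steps.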
Pages: 12619-12629
Page count: 11
Source
Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit, pp. 12619-12629