MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

Authors
Tang, Shitao [1 ]
Zhang, Fuyang [1 ]
Chen, Jiacheng [1 ]
Wang, Peng [2 ]
Furukawa, Yasutaka [1 ]
Affiliations
[1] Simon Fraser Univ, Burnaby, BC, Canada
[2] Bytedance, Beijing, Peoples R China
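Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract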
This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth maps and poses). Unlike prior methods that rely on iterative image warping and inpainting, MVDiffusion simultaneously generates all images with a global awareness, effectively addressing the prevalent error accumulation issue. At its core, MVDiffusion processes perspective images in parallel with a pre-trained text-to-image diffusion model, while integrating novel correspondence-aware attention layers to facilitate cross-view interactions. For panorama generation, while only trained with 10k panoramas, MVDiffusion is able to generate high-resolution photorealistic images for arbitrary texts or extrapolate one perspective image to a 360-degree view. For multi-view depth-to-image generation, MVDiffusion demonstrates state-of-the-art performance for texturing a scene mesh. The project page is at https://mvdiffusion.github.io/.
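The abstract names perspective crops from a panorama as one source of pixel-to-pixel correspondences. The sketch below illustrates that geometric idea only and is not the authors' implementation: for two pinhole views that share a camera center and differ by a rotation about the vertical axis, a pixel in one view maps to the other by back-projecting it to a ray, rotating the ray, and re-projecting. The function names, the 512-pixel image size, the 90-degree field of view, and the 45-degree yaw step are all illustrative assumptions, not values stated in the abstract.

```python
import math


def focal_length(size, fov_deg):
    """Pinhole focal length in pixels for a square image with the given field of view."""
    return (size / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def correspond(u, v, size, fov_deg, yaw_deg):
    """Map pixel (u, v) in view A to its location in view B.

    Assumption: B shares A's camera center and is A rotated by yaw_deg
    about the vertical axis (positive yaw turns B toward A's right).
    Returns None when the ray does not point in front of B.
    """
    f = focal_length(size, fov_deg)
    c = size / 2.0  # principal point assumed at the image center

    # Back-project the pixel to a ray in A's frame (x right, y down, z forward).
    x, y, z = (u - c) / f, (v - c) / f, 1.0

    # Express the same ray in B's frame by applying the inverse yaw rotation.
    t = math.radians(yaw_deg)
    xb = math.cos(t) * x - math.sin(t) * z
    yb = y
    zb = math.sin(t) * x + math.cos(t) * z

    if zb <= 1e-9:
        return None  # the ray points sideways or behind view B

    # Re-project onto B's image plane.
    return (f * xb / zb + c, f * yb / zb + c)


if __name__ == "__main__":
    # The center of view A lands on the left edge of a view rotated 45 degrees to the right.
    print(correspond(256, 256, size=512, fov_deg=90, yaw_deg=45))
```

A pixel that returns None has no counterpart in the other view; in a cross-view attention scheme such pixels would simply have nothing to attend to in that neighbor.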
Pages: 32