MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

Authors
Tang, Shitao [1]
Zhang, Fuyang [1]
Chen, Jiacheng [1]
Wang, Peng [2]
Furukawa, Yasutaka [1]
Affiliations
[1] Simon Fraser University, Burnaby, BC, Canada
[2] ByteDance, Beijing, People's Republic of China
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth maps and poses). Unlike prior methods that rely on iterative image warping and inpainting, MVDiffusion simultaneously generates all images with a global awareness, effectively addressing the prevalent error accumulation issue. At its core, MVDiffusion processes perspective images in parallel with a pre-trained text-to-image diffusion model, while integrating novel correspondence-aware attention layers to facilitate cross-view interactions. For panorama generation, while only trained with 10k panoramas, MVDiffusion is able to generate high-resolution photorealistic images for arbitrary texts or extrapolate one perspective image to a 360-degree view. For multi-view depth-to-image generation, MVDiffusion demonstrates state-of-the-art performance for texturing a scene mesh. The project page is at https://mvdiffusion.github.io/.
Pages: 32