DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

Cited by: 0
Authors
Xia, Bin [1 ]
Zhang, Yulun [2 ]
Wang, Shiyin [3 ]
Wang, Yitong [3 ]
Wu, Xinglong [3 ]
Tian, Yapeng [4 ]
Yang, Wenming [1 ]
Timofte, Radu [5 ,6 ]
Van Gool, Luc [2 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] Bytedance Inc, Shenzhen 518055, Peoples R China
[4] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA
[5] Univ Wurzburg, Comp Vis Lab, IFI, D-97070 Wurzburg, Germany
[6] Univ Wurzburg, CAIDAS, D-97070 Wurzburg, Germany
Funding
National Natural Science Foundation of China
Keywords
Intellectual property; Noise reduction; Runtime; Image synthesis; Transformers; Semantic segmentation; Image restoration; Diffusion processes; Dense prediction; diffusion model; image restoration; image-to-image translation; inpainting; motion deblurring; super-resolution; SUPERRESOLUTION; RESTORATION; NETWORK
DOI
10.1109/TPAMI.2024.3498003
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The Diffusion Model (DM) has emerged as the state-of-the-art (SOTA) approach for image synthesis. However, existing DMs perform poorly on some image-to-image translation (I2I) tasks. Unlike image synthesis, I2I tasks such as super-resolution require results that are consistent with ground-truth (GT) images. Traditional DMs for image synthesis need extensive iterations and large denoising models to estimate entire images, which gives them strong generative ability but also leads to artifacts and inefficiency for I2I. To tackle this challenge, we propose a simple, efficient, and powerful DM framework for I2I, called DiffI2I. Specifically, DiffI2I comprises three key components: a compact I2I prior extraction network (CPEN), a dynamic I2I transformer (DI2Iformer), and a denoising network. We train DiffI2I in two stages: pretraining and DM training. In pretraining, GT and input images are fed into CPEN_S1 to capture a compact I2I prior representation (IPR) that guides DI2Iformer. In the second stage, the DM is trained to estimate the same IPR as CPEN_S1 using only the input images. Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and to employ a lighter denoising network with fewer iterations. Through extensive experiments on various I2I tasks, we demonstrate that DiffI2I achieves SOTA performance while significantly reducing computational burdens.
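The abstract's central idea is that diffusion runs over a compact vector (the IPR) rather than over the full image, so the denoiser can be small and the iteration count low. A minimal numpy sketch of that idea follows; the dimensions, step count, noise schedule, and the oracle noise predictor are all illustrative assumptions, not the paper's actual design:

```python
import numpy as np

# Sketch: diffusion over a compact I2I prior representation (IPR).
# `ipr_dim` and `T` are illustrative; in DiffI2I the IPR is produced by
# CPEN_S1, and a trained network (not an oracle) predicts the noise.

ipr_dim = 64          # compact IPR size, vs. H*W*3 for a whole image
T = 4                 # few reverse steps suffice for a compact vector
betas = np.linspace(1e-2, 2e-1, T)
alphas = 1.0 - betas
abar = np.cumprod(alphas)             # cumulative alpha-bar schedule

rng = np.random.default_rng(0)
z0 = rng.standard_normal(ipr_dim)     # stand-in for CPEN_S1's IPR

def q_sample(z0, t, eps):
    """Forward diffusion: z_t = sqrt(abar_t) z_0 + sqrt(1 - abar_t) eps."""
    return np.sqrt(abar[t]) * z0 + np.sqrt(1.0 - abar[t]) * eps

def ddim_step(zt, t, eps_pred):
    """Deterministic DDIM-style reverse step from step t to t-1."""
    z0_hat = (zt - np.sqrt(1.0 - abar[t]) * eps_pred) / np.sqrt(abar[t])
    if t == 0:
        return z0_hat
    return np.sqrt(abar[t - 1]) * z0_hat + np.sqrt(1.0 - abar[t - 1]) * eps_pred

# With an oracle noise predictor, T reverse steps recover z0 exactly,
# illustrating why an accurate predictor over a small latent needs only
# a light denoiser and few iterations.
eps = rng.standard_normal(ipr_dim)
zt = q_sample(z0, T - 1, eps)
for t in reversed(range(T)):
    zt = ddim_step(zt, t, eps)        # a trained denoiser would predict eps here
print(np.allclose(zt, z0))            # → True
```

In the paper's second training stage, the role of the oracle `eps` above is played by a denoising network conditioned only on the input image, trained so that the recovered vector matches CPEN_S1's IPR.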
Pages: 1578-1593
Page count: 16