DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

Authors
Xia, Bin [1]
Zhang, Yulun [2]
Wang, Shiyin [3]
Wang, Yitong [3]
Wu, Xinglong [3]
Tian, Yapeng [4]
Yang, Wenming [1]
Timofte, Radu [5, 6]
Van Gool, Luc [2]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] Bytedance Inc, Shenzhen 518055, Peoples R China
[4] Univ Texas Dallas, Dept Comp Sci, Richardson, TX 75080 USA
[5] Univ Wurzburg, Comp Vis Lab, IFI, D-97070 Wurzburg, Germany
[6] Univ Wurzburg, CAIDAS, D-97070 Wurzburg, Germany
Funding
National Natural Science Foundation of China
Keywords
Intellectual property; Noise reduction; Runtime; Image synthesis; Transformers; Semantic segmentation; Image restoration; Diffusion processes; Dense prediction; diffusion model; image restoration; image-to-image translation; inpainting; motion deblurring; super-resolution; SUPERRESOLUTION; RESTORATION; NETWORK;
DOI
10.1109/TPAMI.2024.3498003
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Diffusion Model (DM) has emerged as the SOTA approach for image synthesis. However, existing DMs do not perform well on some image-to-image translation (I2I) tasks. Unlike image synthesis, some I2I tasks, such as super-resolution, require generating results that agree with the ground-truth (GT) images. Traditional DMs for image synthesis require extensive iterations and large denoising models to estimate entire images, which gives them strong generative ability but also leads to artifacts and inefficiency for I2I. To tackle this challenge, we propose a simple, efficient, and powerful DM framework for I2I, called DiffI2I. Specifically, DiffI2I comprises three key components: a compact I2I prior extraction network (CPEN), a dynamic I2I transformer (DI2Iformer), and a denoising network. We train DiffI2I in two stages: pretraining and DM training. For pretraining, GT and input images are fed into CPEN_S1 to capture a compact I2I prior representation (IPR) that guides DI2Iformer. In the second stage, the DM is trained to use only the input images to estimate the same IPR as CPEN_S1. Compared to traditional DMs, the compact IPR enables DiffI2I to obtain more accurate outcomes and employ a lighter denoising network and fewer iterations. Through extensive experiments on various I2I tasks, we demonstrate that DiffI2I achieves SOTA performance while significantly reducing computational burdens.
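
The two-stage idea in the abstract can be illustrated with a toy sketch: diffusion runs on a small prior vector rather than on a whole image. Everything in it is an assumption made for illustration and is not taken from the paper: the names `cpen_s1`, `alpha_bars`, `forward_diffuse`, and `predict_z0`, the sizes `IPR_DIM` and `T`, the linear beta schedule, and the pooling used as a stand-in for the prior extraction network. The forward step is the standard DDPM noising formula, and the true noise is used as an oracle where a trained denoising network would supply an estimate.

```python
import math
import random

IPR_DIM = 4  # hypothetical size of the compact prior vector
T = 4        # hypothetical (small) number of diffusion steps


def cpen_s1(gt, inp):
    """Toy stand-in for a stage-1 prior extractor: pool the GT-minus-input residual."""
    residual = [g - x for g, x in zip(gt, inp)]
    chunk = len(residual) // IPR_DIM
    return [sum(residual[i * chunk:(i + 1) * chunk]) / chunk for i in range(IPR_DIM)]


def alpha_bars(num_steps, beta_start=0.1, beta_end=0.9):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_diffuse(z0, abar_t, rng):
    """Standard DDPM forward step: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(abar_t) * z + math.sqrt(1.0 - abar_t) * e for z, e in zip(z0, eps)]
    return zt, eps


def predict_z0(zt, eps_hat, abar_t):
    """Invert the forward step given a noise estimate eps_hat."""
    return [(z - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
            for z, e in zip(zt, eps_hat)]


rng = random.Random(0)
gt = [rng.random() for _ in range(16)]   # toy "ground-truth image"
inp = [rng.random() for _ in range(16)]  # toy "input image"

ipr = cpen_s1(gt, inp)                   # stage 1: compact prior from (GT, input)
abar = alpha_bars(T)
zt, eps = forward_diffuse(ipr, abar[-1], rng)
ipr_hat = predict_z0(zt, eps, abar[-1])  # oracle noise stands in for the denoiser
```

Because the diffused quantity here has only `IPR_DIM` entries, the denoiser operates on a far smaller target than a full image, which is the intuition behind the abstract's claim that a lighter denoising network and fewer iterations suffice.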
Pages: 1578-1593
Page count: 16
Related papers
  • [1] A Diffusion Model Translator for Efficient Image-to-Image Translation
    Xia, Mengfei
    Zhou, Yu
    Yi, Ran
    Liu, Yong-Jin
    Wang, Wenping
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (12) : 10272 - 10283
  • [2] Unpaired Image-to-Image Translation with Diffusion Adversarial Network
    Tu, Hangyao
    Wang, Zheng
    Zhao, Yanwei
    MATHEMATICS, 2024, 12 (20)
  • [3] BBDM: Image-to-Image Translation with Brownian Bridge Diffusion Models
    Li, Bo
    Xue, Kaitao
    Liu, Bin
    Lai, Yu-Kun
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 1952 - 1961
  • [4] Reversible GANs for Memory-efficient Image-to-Image Translation
    van der Ouderaa, Tycho F. A.
    Worrall, Daniel E.
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 4715 - 4723
  • [5] Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
    Zhang, Linfeng
    Chen, Xin
    Tu, Xiaobing
    Wan, Pengfei
    Xu, Ning
    Ma, Kaisheng
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12454 - 12464
  • [6] Conditional Image-to-Image translation
    Lin, Jianxin
    Xia, Yingce
    Qin, Tao
    Chen, Zhibo
    Liu, Tie-Yan
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 5524 - 5532
  • [7] UGC: Unified GAN Compression for Efficient Image-to-Image Translation
    Ren, Yuxi
    Wu, Jie
    Zhang, Peng
    Zhang, Manlin
    Xiao, Xuefeng
    He, Qian
    Wang, Rui
    Zheng, Min
    Pan, Xin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17235 - 17245
  • [8] Hypercomplex Image-to-Image Translation
    Grassucci, Eleonora
    Sigillo, Luigi
    Uncini, Aurelio
    Comminiello, Danilo
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [9] RL-I2IT: Image-to-Image Translation with Deep Reinforcement Learning
    Wang, Xin
    Luo, Ziwei
    Hu, Jing
    Feng, Chengming
    Hu, Shu
    Zhu, Bin
    Wu, Xi
    Zhu, Hongtu
    Li, Xin
    Lyu, Siwei
    arXiv, 2023,
  • [10] IMAGE DATA AUGMENTATION WITH UNPAIRED IMAGE-TO-IMAGE CAMERA MODEL TRANSLATION
    Foo, Chi Fa
    Winkler, Stefan
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 3246 - 3250