A Diffusion Model Translator for Efficient Image-to-Image Translation

Cited by: 3

Authors:

Xia, Mengfei [1]

Zhou, Yu [1]

Yi, Ran [2]

Liu, Yong-Jin [1]

Wang, Wenping [3]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, MOE Key Lab Pervas Comp, Beijing 100084, Peoples R China

[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China

[3] Texas A&M Univ, Dept Comp Sci & Comp Engn, College Stn, TX 77840 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2024 / Vol. 46 / No. 12

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China;

Keywords:

Task analysis; Noise reduction; Diffusion models; Diffusion processes; Training; Computer science; Trajectory; image translation; deep learning; generative models;

DOI:

10.1109/TPAMI.2024.3435448

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into each denoising step for an iterative refinement, thus resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that in employing the pioneering DDPM work for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate step. We further observe that the translation performance highly depends on the chosen timestep for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency. Code is available at https://github.com/THU-LYJ-Lab/dmt.
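
The idea in the abstract, transferring between domains at a single intermediate timestep instead of at every denoising step, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It uses the standard DDPM closed-form forward process with DDPM's linear noise schedule; `translator` and `denoiser` are hypothetical placeholder callables standing in for a learned translator and a pretrained reverse process, and the timestep `t` is passed in by hand rather than selected automatically.

```python
import numpy as np

def linear_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear DDPM noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def translate_at_step(x_src, t, alpha_bars, translator, denoiser, rng):
    """Noise the source image to step t, translate once, then denoise.

    translator and denoiser are hypothetical callables (assumptions of
    this sketch), each taking (array, timestep) and returning an array.
    """
    x_t = forward_diffuse(x_src, t, alpha_bars, rng)  # source domain, step t
    y_t = translator(x_t, t)                          # single domain transfer
    return denoiser(y_t, t)                           # reverse process to step 0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bars = linear_alpha_bars()
    x_src = np.zeros((8, 8))
    identity = lambda z, t: z  # stand-in for both learned components
    y = translate_at_step(x_src, 500, alpha_bars, identity, identity, rng)
    print(y.shape)
```

In this sketch the source image is touched only once, when it is noised to step `t`, which is the contrast the abstract draws with methods that inject source information at every denoising step.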

Pages: 10272-10283

Page count: 12
