Conditional Score Guidance for Text-Driven Image-to-Image Translation

Cited by: 0
Authors
Lee, Hyunsoo [1]
Kang, Minsoo [1]
Han, Bohyung [1, 2]
Affiliations
[1] Seoul Natl Univ, ECE, Seoul, South Korea
[2] Seoul Natl Univ, IPAI, Seoul, South Korea
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
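Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract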
We present a novel algorithm for text-driven image-to-image translation based on a pretrained text-to-image diffusion model. Our method generates a target image by selectively editing the regions of interest in a source image, as specified by a modifying text, while preserving the remaining parts. In contrast to existing techniques that rely solely on a target prompt, we introduce a new score function that additionally considers both the source image and the source text prompt, tailored to specific translation tasks. To this end, we derive the conditional score function in a principled way, decomposing it into the standard score and a guiding term for target image generation. To compute the gradient of the guiding term, we assume that the posterior distribution is Gaussian and estimate its mean and variance, which lets us adjust the gradient without additional training. In addition, to improve the quality of the conditional score guidance, we incorporate a simple yet effective mixup technique that combines two cross-attention maps derived from the source and target latents. This strategy promotes a desirable fusion of the invariant parts of the source image and the edited regions aligned with the target prompt, leading to high-fidelity target image generation. Through comprehensive experiments, we demonstrate that our approach achieves outstanding image-to-image translation performance on various tasks. Code is available at https://github.com/Hleephilip/CSG.
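The decomposition mentioned in the abstract can be illustrated with Bayes' rule. The following is only a sketch under assumed notation, not the paper's exact formulation: $x_t$ denotes the noisy target latent at step $t$, $x^{\mathrm{src}}$ the source image (or its latent), and $y^{\mathrm{src}}$, $y^{\mathrm{tgt}}$ the source and target prompts. The paper's precise conditioning variables may differ.

```latex
\begin{aligned}
p(x_t \mid x^{\mathrm{src}}, y^{\mathrm{src}}, y^{\mathrm{tgt}})
  &= \frac{p(x_t \mid y^{\mathrm{tgt}})\,
           p(x^{\mathrm{src}}, y^{\mathrm{src}} \mid x_t, y^{\mathrm{tgt}})}
          {p(x^{\mathrm{src}}, y^{\mathrm{src}} \mid y^{\mathrm{tgt}})} \\[4pt]
\nabla_{x_t} \log p(x_t \mid x^{\mathrm{src}}, y^{\mathrm{src}}, y^{\mathrm{tgt}})
  &= \underbrace{\nabla_{x_t} \log p(x_t \mid y^{\mathrm{tgt}})}_{\text{standard score}}
   + \underbrace{\nabla_{x_t} \log p(x^{\mathrm{src}}, y^{\mathrm{src}} \mid x_t, y^{\mathrm{tgt}})}_{\text{guiding term}}
\end{aligned}
```

The denominator does not depend on $x_t$, so its gradient vanishes. The first term is the score a pretrained text-to-image diffusion model already provides, and the second is the posterior term that, according to the abstract, is modeled as a Gaussian whose mean and variance are estimated without additional training.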
Pages: 24