Style-Guided Inference of Transformer for High-resolution Image Synthesis

Cited by: 0
Authors
Yim, Jonghwa [1]
Kim, Minjae [1]
Affiliations
[1] NCSOFT, AI Ctr, Vis AI Lab, Seoul, South Korea
DOI
10.1109/WACV56688.2023.00179
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformers are well suited to auto-regressive image synthesis, which builds a full image by recursively predicting each discrete value from the previously generated values. In particular, when combined with a vector-quantised latent representation, state-of-the-art auto-regressive transformers produce realistic high-resolution images. However, sampling the latent codes from a discrete probability distribution makes the output unpredictable, so many diverse samples must be generated to obtain a desired output. To avoid repeatedly generating large numbers of samples, in this article we propose taking a desired output, a style image, as an additional condition without re-training the transformer. To this end, our method converts the style into a probability constraint that re-balances the prior, thereby specifying a target distribution in place of the original prior. Samples generated from the re-balanced prior therefore have styles similar to the reference style. In practice, either a single image or a category of images can serve as the additional condition. In our qualitative assessment, we show that the styles of the majority of outputs are similar to the input style.
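The idea of re-balancing the prior with a style-derived probability constraint can be illustrated with a toy sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the constraint is taken to be a smoothed histogram of the style image's codebook indices, the re-balancing is taken to be a multiplicative re-weighting followed by renormalization, and the names `style_constraint`, `rebalance`, `strength`, and `smoothing` are hypothetical. It is not the authors' implementation.

```python
import math
import random
from collections import Counter


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def style_constraint(style_codes, codebook_size, smoothing=1e-3):
    """Smoothed frequency of each codebook index among the style image's codes.

    Assumption: the style is summarized as a histogram over codebook indices.
    """
    counts = Counter(style_codes)
    total = len(style_codes) + smoothing * codebook_size
    return [(counts.get(k, 0) + smoothing) / total for k in range(codebook_size)]


def rebalance(prior, constraint, strength=1.0):
    """Re-weight the prior by the constraint and renormalize.

    Assumption: a multiplicative form; strength=0 leaves the prior unchanged.
    """
    weighted = [p * (c ** strength) for p, c in zip(prior, constraint)]
    total = sum(weighted)
    return [w / total for w in weighted]


def sample(probs, rng):
    """Draw one codebook index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy example: a codebook of 4 entries and a uniform transformer prior.
rng = random.Random(0)
prior = softmax([0.0, 0.0, 0.0, 0.0])
constraint = style_constraint(style_codes=[2, 2, 2, 1], codebook_size=4)
target = rebalance(prior, constraint)
next_code = sample(target, rng)
```

In this toy setting the transformer itself is untouched; only the distribution it samples from at each step is shifted toward codes that occur in the style image, which mirrors the abstract's claim that no re-training is needed.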
Pages: 1745-1755
Page count: 11
Related papers
50 in total
  • [21] StyleSwin: Transformer-based GAN for High-resolution Image Generation
    Zhang, Bowen
    Gu, Shuyang
    Zhang, Bo
    Bao, Jianmin
    Chen, Dong
    Wen, Fang
    Wang, Yong
    Guo, Baining
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11294 - 11304
  • [22] HSSAN: hair synthesis with style-guided spatially adaptive normalization on generative adversarial network
    Hu, Xinrong
    Chang, Qing
    Huang, Junjie
    Luo, Ruiqi
    Wang, Bangchao
    Hu, Chang
    VISUAL COMPUTER, 2023, 39 (08): : 3311 - 3318
  • [23] StyleGuide: Zero-Shot Sketch-Based Image Retrieval Using Style-Guided Image Generation
    Dutta, Titir
    Singh, Anurag
    Biswas, Soma
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 2833 - 2842
  • [25] THEOREM FOR HIGH-RESOLUTION HIGH-CONTRAST IMAGE SYNTHESIS
    BUCKLEW, JA
    SALEH, BEA
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 1985, 2 (08): : 1233 - 1236
  • [26] High-Resolution Image Synthesis with Latent Diffusion Models
    Rombach, Robin
    Blattmann, Andreas
    Lorenz, Dominik
    Esser, Patrick
    Ommer, Bjoern
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 10674 - 10685
  • [27] Improved Transformer for High-Resolution GANs
    Zhao, Long
    Zhang, Zizhao
    Chen, Ting
    Metaxas, Dimitris N.
    Zhang, Han
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [28] ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation
    Liu, Yahui
    Chen, Yajing
    Bao, Linchao
    Sebe, Nicu
    Lepri, Bruno
    De Nadai, Marco
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3343 - 3353
  • [29] Synthesis of a high-resolution 3D-stereoscopic image pair from a high-resolution monoscopic image and a low-resolution depth map
    Kim, KT
    Siegel, M
    Son, JY
    STEREOSCOPIC DISPLAYS AND VIRTUAL REALITY SYSTEMS V, 1998, 3295 : 76 - 86
  • [30] High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
    Wang, Ting-Chun
    Liu, Ming-Yu
    Zhu, Jun-Yan
    Tao, Andrew
    Kautz, Jan
    Catanzaro, Bryan
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 8798 - 8807