Style-Guided Inference of Transformer for High-resolution Image Synthesis

Authors
Yim, Jonghwa [1 ]
Kim, Minjae [1 ]
Affiliation
[1] NCSOFT, AI Ctr, Vis AI Lab, Seoul, South Korea
DOI
10.1109/WACV56688.2023.00179
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformers are well suited to auto-regressive image synthesis, which recursively predicts each discrete value from the preceding values to build up a full image. In particular, combined with a vector-quantised latent representation, state-of-the-art auto-regressive transformers produce realistic high-resolution images. However, sampling the latent code from a discrete probability distribution makes the output unpredictable, so many diverse samples must be generated to obtain a desired output. To reduce this repeated sampling, we propose taking a desired output, a style image, as an additional condition without re-training the transformer. To this end, our method converts the style into a probability constraint that re-balances the prior, thereby specifying a target distribution in place of the original prior. Samples generated from the re-balanced prior therefore have styles similar to the reference style. In practice, either a single image or a category of images can serve as the additional condition. In our qualitative assessment, we show that the styles of the majority of outputs are similar to the input style.
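The abstract does not give the exact form of the probability constraint, so the following is only a minimal sketch of one way a prior over codebook entries could be re-balanced at inference time. It assumes the style image has already been encoded into codebook indices, and it uses a smoothed histogram of those indices as the constraint. The names `style_histogram`, `rebalance_prior`, `strength`, and `eps` are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Convert raw logits into a probability vector."""
    z = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def style_histogram(style_codes, codebook_size, eps=1e-3):
    """Smoothed frequency of each codebook index in the style image (assumed constraint)."""
    counts = np.bincount(style_codes, minlength=codebook_size).astype(float)
    probs = counts + eps  # eps keeps unseen codes at a small non-zero weight
    return probs / probs.sum()

def rebalance_prior(logits, style_probs, strength=1.0):
    """Re-weight the transformer's prior by the style constraint and renormalise.

    strength=0 returns the original prior; larger values follow the style more closely.
    No transformer weights are changed, only the sampling distribution.
    """
    prior = softmax(logits)
    weighted = prior * style_probs ** strength
    return weighted / weighted.sum()

# Toy example: 8 codebook entries, random logits standing in for transformer output.
rng = np.random.default_rng(0)
codebook_size = 8
logits = rng.normal(size=codebook_size)
style_codes = np.array([2, 2, 5, 2, 5, 2])  # hypothetical codes of an encoded style image

style_probs = style_histogram(style_codes, codebook_size)
target = rebalance_prior(logits, style_probs, strength=1.0)
next_code = rng.choice(codebook_size, p=target)  # sample the next latent code
```

In this toy setup the probability mass moves toward codes 2 and 5, the ones that occur in the style image, so samples drawn from `target` favour those entries while the underlying model stays untouched.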
Pages: 1745-1755
Page count: 11