InstaFormer: Instance-Aware Image-to-Image Translation with Transformer

Cited by: 23
Authors
Kim, Soohyun [1]
Baek, Jongbeom [1]
Park, Jihye [1]
Kim, Gyeongnyeon [1]
Kim, Seungryong [1]
Affiliations
[1] Korea Univ, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
DOI
10.1109/CVPR52688.2022.01778
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel Transformer-based network architecture for instance-aware image-to-image translation, dubbed InstaFormer, to effectively integrate global- and instance-level information. By considering extracted content features from an image as tokens, our networks discover global consensus of content features by considering context information through a self-attention module in Transformers. By augmenting such tokens with an instance-level feature extracted from the content feature with respect to bounding box information, our framework is capable of learning an interaction between object instances and the global image, thus boosting the instance-awareness. We replace layer normalization (LayerNorm) in standard Transformers with adaptive instance normalization (AdaIN) to enable a multi-modal translation with style codes. In addition, to improve the instance-awareness and translation quality at object regions, we present an instance-level content contrastive loss defined between input and translated image. We conduct experiments to demonstrate the effectiveness of our InstaFormer over the latest methods and provide extensive ablation studies.
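The abstract's replacement of LayerNorm with AdaIN can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the standard AdaIN formulation, in which each channel is normalized across all tokens and then scaled and shifted by style-dependent parameters. The names `adain_tokens`, `style_to_affine`, `w_gamma`, and `w_beta` are hypothetical, and the linear map from style code to affine parameters is an assumption made for illustration.

```python
import numpy as np


def style_to_affine(style_code, w_gamma, w_beta):
    """Map a style code to per-channel scale and shift (hypothetical linear map)."""
    gamma = style_code @ w_gamma  # shape: (channels,)
    beta = style_code @ w_beta    # shape: (channels,)
    return gamma, beta


def adain_tokens(tokens, gamma, beta, eps=1e-5):
    """Apply AdaIN to a token sequence.

    tokens: array of shape (num_tokens, channels)
    gamma, beta: arrays of shape (channels,)

    LayerNorm would normalize each token across its channels; here each
    channel is normalized across all tokens instead, and the affine
    parameters come from the style code rather than being learned constants.
    """
    mean = tokens.mean(axis=0, keepdims=True)
    var = tokens.var(axis=0, keepdims=True)
    normalized = (tokens - mean) / np.sqrt(var + eps)
    return gamma * normalized + beta


# Toy usage: 16 content tokens with 8 channels, and a 4-dimensional style code.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))
style_code = rng.normal(size=(4,))
w_gamma = rng.normal(size=(4, 8))
w_beta = rng.normal(size=(4, 8))

gamma, beta = style_to_affine(style_code, w_gamma, w_beta)
stylized = adain_tokens(tokens, gamma, beta)
```

Sampling a different `style_code` changes only `gamma` and `beta`, so the same content tokens can be rendered in several styles, which is the sense in which the abstract describes the translation as multi-modal.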
Pages: 18300-18310
Page count: 11