InstaFormer: Instance-Aware Image-to-Image Translation with Transformer

Cited by: 23
Authors
Kim, Soohyun [1]
Baek, Jongbeom [1]
Park, Jihye [1]
Kim, Gyeongnyeon [1]
Kim, Seungryong [1]
Affiliations
[1] Korea Univ, Seoul, South Korea
Funding
National Research Foundation of Singapore;
DOI
10.1109/CVPR52688.2022.01778
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a novel Transformer-based network architecture for instance-aware image-to-image translation, dubbed InstaFormer, which effectively integrates global- and instance-level information. By treating content features extracted from an image as tokens, our network discovers a global consensus among them, exploiting context information through the self-attention modules in Transformers. By augmenting these tokens with instance-level features extracted from the content features according to bounding box information, our framework learns an interaction between object instances and the global image, thus boosting instance-awareness. We replace layer normalization (LayerNorm) in standard Transformers with adaptive instance normalization (AdaIN) to enable multi-modal translation with style codes. In addition, to improve instance-awareness and translation quality at object regions, we present an instance-level content contrastive loss defined between the input and translated images. Experiments demonstrate the effectiveness of InstaFormer over the latest methods, and extensive ablation studies are provided.
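To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: the module names (AdaIN, AdaINTransformerBlock, build_tokens), the toy dimensions, and the use of torchvision.ops.roi_align to pool one instance token per bounding box are all illustrative assumptions. It shows only the two ideas the abstract highlights: a Transformer block whose LayerNorm is replaced by style-conditioned AdaIN, and instance tokens concatenated with global content tokens so that instances and the whole image interact through self-attention.

# Minimal sketch (illustrative, not the authors' code): an AdaIN-modulated
# Transformer block over global content tokens plus per-box instance tokens.
import torch
import torch.nn as nn
from torchvision.ops import roi_align


class AdaIN(nn.Module):
    """Instance-normalize tokens, then re-scale/shift them with style-predicted affine parameters."""
    def __init__(self, dim, style_dim):
        super().__init__()
        self.norm = nn.InstanceNorm1d(dim, affine=False)
        self.affine = nn.Linear(style_dim, 2 * dim)  # predicts per-channel gamma and beta

    def forward(self, tokens, style):                # tokens: (B, N, D), style: (B, style_dim)
        gamma, beta = self.affine(style).chunk(2, dim=-1)
        x = self.norm(tokens.transpose(1, 2)).transpose(1, 2)
        return (1 + gamma.unsqueeze(1)) * x + beta.unsqueeze(1)


class AdaINTransformerBlock(nn.Module):
    """Pre-norm Transformer block with LayerNorm swapped for style-conditioned AdaIN."""
    def __init__(self, dim, style_dim, heads=4):
        super().__init__()
        self.norm1 = AdaIN(dim, style_dim)
        self.norm2 = AdaIN(dim, style_dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, tokens, style):
        h = self.norm1(tokens, style)
        tokens = tokens + self.attn(h, h, h, need_weights=False)[0]  # global and instance tokens attend jointly
        return tokens + self.mlp(self.norm2(tokens, style))


def build_tokens(content_feat, boxes, embed):
    """Flatten a content feature map into global tokens and append one pooled token per bounding box."""
    B, C, H, W = content_feat.shape
    global_tokens = embed(content_feat.flatten(2).transpose(1, 2))           # (B, H*W, D)
    # Boxes are given per image as (K_i, 4) tensors; spatial_scale=1.0 means feature-map coordinates.
    inst = roi_align(content_feat, boxes, output_size=1, spatial_scale=1.0)  # (sum K_i, C, 1, 1)
    # This toy reshape assumes every image has the same number of boxes.
    inst_tokens = embed(inst.flatten(1)).view(B, -1, global_tokens.shape[-1])
    return torch.cat([global_tokens, inst_tokens], dim=1)


if __name__ == "__main__":
    B, C, D, S, H, W = 2, 64, 128, 8, 16, 16
    feat = torch.randn(B, C, H, W)                                   # content features from an encoder
    boxes = [torch.tensor([[2.0, 2.0, 8.0, 8.0]]),                   # one box for image 0
             torch.tensor([[4.0, 4.0, 12.0, 12.0]])]                 # one box for image 1
    tokens = build_tokens(feat, boxes, nn.Linear(C, D))
    out = AdaINTransformerBlock(D, S)(tokens, torch.randn(B, S))     # style code drives AdaIN
    print(out.shape)                                                 # torch.Size([2, 257, 128])

Running the script prints torch.Size([2, 257, 128]): 256 global tokens from a 16x16 feature map plus one instance token per image, all modulated by an 8-dimensional style code.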
Pages: 18300 - 18310
Page count: 11
Related Papers
50 entries in total
  • [1] InstaFormer++: Multi-Domain Instance-Aware Image-to-Image Translation with Transformer
    Kim, Soohyun
    Baek, Jongbeom
    Park, Jihye
    Ha, Eunjae
    Jung, Homin
    Lee, Taeyoung
    Kim, Seungryong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (04) : 1167 - 1186
  • [2] RHN: RoI Restricted Hybrid Network for Instance-Aware Image-to-Image Translation
    Liu, Yaqi
    Wang, Hanhan
    Zhang, Jianyi
    Xiao, Song
    Cai, Qiang
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 1156 - 1160
  • [3] Instance-aware Image Colorization
    Su, Jheng-Wei
    Chu, Hung-Kuo
    Huang, Jia-Bin
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020, : 7965 - 7974
  • [4] Instance-aware image dehazing
    Chao, Qingqing
    Yan, Jinqiang
    Sun, Tianmeng
    Li, Silong
    Chi, Jieru
    Yang, Guowei
    Chen, Chenglizhao
    Yu, Teng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133
  • [5] Towards Instance-level Image-to-Image Translation
    Shen, Zhiqiang
    Huang, Mingyang
    Shi, Jianping
    Xue, Xiangyang
    Huang, Thomas
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3678 - 3687
  • [6] Panoptic-aware Image-to-Image Translation
    Zhang, Liyun
    Ratsamee, Photchara
    Wang, Bowen
    Luo, Zhaojie
    Uranishi, Yuki
    Higashida, Manabu
    Takemura, Haruo
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 259 - 268
  • [7] Quality-Aware Unpaired Image-to-Image Translation
    Chen, Lei
    Wu, Le
    Hu, Zhenzhen
    Wang, Meng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (10) : 2664 - 2674
  • [8] Geometry-Aware Eye Image-To-Image Translation
    Lu, Conny
    Zhang, Qian
    Krishnakumar, Kapil
    Chen, Jixu
    Fuchs, Henry
    Talathi, Sachin
    Liu, Kun
    2022 ACM SYMPOSIUM ON EYE TRACKING RESEARCH AND APPLICATIONS, ETRA 2022, 2022,
  • [9] Unsupervised Exemplar-Domain Aware Image-to-Image Translation
    Fu, Yuanbin
    Ma, Jiayi
    Guo, Xiaojie
    ENTROPY, 2021, 23 (05)