InstaFormer: Instance-Aware Image-to-Image Translation with Transformer

Cited by: 23

Authors:

Kim, Soohyun [1]

Baek, Jongbeom [1]

Park, Jihye [1]

Kim, Gyeongnyeon [1]

Kim, Seungryong [1]

Affiliations:

[1] Korea Univ, Seoul, South Korea

Source:

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022

Funding:

National Research Foundation, Singapore;

Keywords:

DOI:

10.1109/CVPR52688.2022.01778

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present InstaFormer, a novel Transformer-based network architecture for instance-aware image-to-image translation that effectively integrates global- and instance-level information. By treating content features extracted from an image as tokens, our network discovers a global consensus among content features, taking context information into account through the self-attention module of the Transformer. By augmenting these tokens with instance-level features extracted from the content features using bounding box information, our framework learns the interaction between object instances and the global image, thus boosting instance-awareness. We replace layer normalization (LayerNorm) in standard Transformers with adaptive instance normalization (AdaIN) to enable multi-modal translation with style codes. In addition, to improve instance-awareness and translation quality in object regions, we present an instance-level content contrastive loss defined between the input and translated images. We conduct experiments that demonstrate the effectiveness of InstaFormer over the latest methods and provide extensive ablation studies.
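
The LayerNorm-to-AdaIN swap mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common AdaIN formulation in which each channel is normalized across all tokens and then scaled and shifted by parameters derived from a style code. The function names, the toy token values, and the `gamma`/`beta` values are hypothetical and chosen only for illustration.

```python
import math

def layer_norm(tokens, eps=1e-5):
    """LayerNorm without learned affine: normalize each token across its channels."""
    out = []
    for tok in tokens:
        mean = sum(tok) / len(tok)
        var = sum((x - mean) ** 2 for x in tok) / len(tok)
        out.append([(x - mean) / math.sqrt(var + eps) for x in tok])
    return out

def adain(tokens, gamma, beta, eps=1e-5):
    """AdaIN (assumed form): normalize each channel across all tokens,
    then apply a per-channel scale (gamma) and shift (beta) from a style code."""
    n, c = len(tokens), len(tokens[0])
    out = [[0.0] * c for _ in range(n)]
    for j in range(c):
        col = [tokens[i][j] for i in range(n)]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        for i in range(n):
            out[i][j] = gamma[j] * (col[i] - mean) / math.sqrt(var + eps) + beta[j]
    return out

# Toy input: 3 tokens with 2 channels each (hypothetical values).
tokens = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
# Hypothetical affine parameters that a style encoder might produce.
gamma, beta = [2.0, 0.5], [1.0, -1.0]

normed = layer_norm(tokens)
styled = adain(tokens, gamma, beta)
```

In this sketch the per-channel mean of `styled` equals `beta` and the per-channel standard deviation is approximately `gamma`, so changing the style code changes the output statistics while the content tokens stay fixed. That is the mechanism by which a single content input can yield multiple translations.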

Citation

Pages: 18300-18310

Page count: 11

