Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Cited by: 15

Authors:

Liu, Hongyu [1]

Song, Yibing [2]

Chen, Qifeng [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

[2] Fudan Univ, AI Inst, Shanghai, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

DOI:

10.1109/CVPR52729.2023.00971

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

GAN inversion and editing via StyleGAN maps an input image into the embedding spaces (W, W+, and F) to simultaneously maintain image fidelity and meaningful manipulation. From latent space W to extended latent space W+ to feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W+ and F rather than W to improve reconstruction fidelity while maintaining editability. As W+ and F are derived from W that is essentially the foundation latent space of StyleGAN, these GAN inversion methods focusing on W+ and F spaces could be improved by stepping back to W. In this work, we propose to first obtain the proper latent code in foundation latent space W. We introduce contrastive learning to align W and the image space for proper latent code discovery. Then, we leverage a cross-attention encoder to transform the obtained latent code in W into W+ and F, accordingly. Our experiments show that our exploration of the foundation latent space W improves the representation ability of latent codes in W+ and features in F, which yields state-of-the-art reconstruction fidelity and editability results on the standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.

Pages: 10072 - 10082

Number of pages: 11

50 items in total

[31] E2F-Net: Eyes-to-face inpainting via StyleGAN latent space
Hassanpour, Ahmad
Jamalbafrani, Fatemeh
Yang, Bian
Raja, Kiran
Veldhuis, Raymond
Fierrez, Julian
PATTERN RECOGNITION, 2024, 152
[32] Enhancing robustness to novel visual defects through StyleGAN latent space navigation: a manufacturing use case
Theodoropoulos, Spyros
Dardanis, Dimitrios
Makridis, Georgios
Zajec, Patrik
Rozanec, Joze M.
Kyriazis, Dimosthenis
Tsanakas, Panayiotis
JOURNAL OF INTELLIGENT MANUFACTURING, 2024
[33] Towards Disentangling Latent Space for Unsupervised Semantic Face Editing
Liu, Kanglin
Cao, Gaofeng
Zhou, Fei
Liu, Bozhi
Duan, Jiang
Qiu, Guoping
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1475 - 1489
[34] Facial Attribute Editing by Latent Space Adversarial Variational Autoencoders
Li, Defang
Zhang, Min
Chen, Weifu
Feng, Guocan
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018 : 1337 - 1342
[35] High-Fidelity GAN Inversion for Image Attribute Editing
Wang, Tengfei
Zhang, Yong
Fan, Yanbo
Wang, Jue
Chen, Qifeng
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022 : 11369 - 11378
[36] Effective Real Image Editing with Accelerated Iterative Diffusion Inversion
Pan, Zhihong
Gherardi, Riccardo
Xie, Xiufeng
Huang, Stephen
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023 : 15866 - 15875
[37] Visual Instruction Inversion: Image Editing via Visual Prompting
Nguyen, Thao
Li, Yuheng
Ojha, Utkarsh
Lee, Yong Jae
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[38] Does image editing improve the quality of latent prints? An analysis of image-editing techniques in one crime laboratory
Gardner, Brett O.
Neuman, Maddisen
Kelley, Sharon
Hong, Anni
Mejia, Robin
SCIENCE & JUSTICE, 2023, 63 (01) : 109 - 115
[39] Autoencoder Image Interpolation by Shaping the Latent Space
Oring, Alon
Yakhini, Zohar
Hel-Or, Yacov
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[40] Explainability in image captioning based on the latent space
Elguendouze, Sofiane
Hafiane, Adel
de Souto, Marcilio C. P.
Halftermeyer, Anais
NEUROCOMPUTING, 2023, 546
