Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Cited by: 15

Authors:

Liu, Hongyu [1]

Song, Yibing [2]

Chen, Qifeng [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

[2] Fudan Univ, AI Inst, Shanghai, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023


DOI:

10.1109/CVPR52729.2023.00971

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

GAN inversion and editing via StyleGAN map an input image into the embedding spaces (W, W+, and F) so as to preserve image fidelity while still allowing meaningful manipulation. Moving from the latent space W to the extended latent space W+ to the feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W+ and F rather than W to improve reconstruction fidelity while maintaining editability. Because W+ and F are derived from W, which is essentially the foundation latent space of StyleGAN, inversion methods that focus on W+ and F could be improved by stepping back to W. In this work, we propose to first obtain a proper latent code in the foundation latent space W. We introduce contrastive learning to align W with the image space for proper latent code discovery. We then use a cross-attention encoder to transform the obtained latent code in W into W+ and F, respectively. Our experiments show that exploring the foundation latent space W improves the representation ability of the latent codes in W+ and the features in F, which yields state-of-the-art reconstruction fidelity and editability on standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.
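
The idea of aligning W with the image space through contrastive learning can be illustrated with a generic symmetric InfoNCE-style loss. The sketch below is not the authors' implementation: the function names, the temperature value of 0.07, and the use of plain Python lists as embeddings are all assumptions made for illustration. It only shows the general mechanism, in which the embedding of image i is pulled toward the embedding of its own latent code and pushed away from the latent codes of other images in the batch.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def contrastive_alignment_loss(image_embs, latent_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    image_embs[i] and latent_embs[i] are assumed to form a positive pair;
    every other combination in the batch is treated as a negative.
    The temperature is an illustrative assumption, not a value from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    lats = [l2_normalize(v) for v in latent_embs]
    n = len(imgs)

    # Cosine similarity between every image and every latent code.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], lats[j])) / temperature
         for j in range(n)]
        for i in range(n)
    ]

    # Image-to-latent direction: each row should peak on its diagonal entry.
    loss_i2l = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    # Latent-to-image direction: each column should peak on its diagonal entry.
    loss_l2i = sum(
        cross_entropy([logits[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (loss_i2l + loss_l2i)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("aligned pairs:   ", contrastive_alignment_loss(images, matched))
    print("misaligned pairs:", contrastive_alignment_loss(images, swapped))
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairing is broken, which is the signal an encoder would be trained to minimize under this kind of objective.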


Pages: 10072-10082

Number of pages: 11


