Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Cited by: 15

Authors:

Liu, Hongyu [1]

Song, Yibing [2]

Chen, Qifeng [1]

Affiliations:

[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China

[2] Fudan Univ, AI Inst, Shanghai, Peoples R China

Source:

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023

Keywords:

DOI:

10.1109/CVPR52729.2023.00971

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

GAN inversion and editing via StyleGAN maps an input image into the embedding spaces (W, W+, and F) to simultaneously maintain image fidelity and meaningful manipulation. From latent space W to extended latent space W+ to feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W+ and F rather than W to improve reconstruction fidelity while maintaining editability. As W+ and F are derived from W that is essentially the foundation latent space of StyleGAN, these GAN inversion methods focusing on W+ and F spaces could be improved by stepping back to W. In this work, we propose to first obtain the proper latent code in foundation latent space W. We introduce contrastive learning to align W and the image space for proper latent code discovery. Then, we leverage a cross-attention encoder to transform the obtained latent code in W into W+ and F, accordingly. Our experiments show that our exploration of the foundation latent space W improves the representation ability of latent codes in W+ and features in F, which yields state-of-the-art reconstruction fidelity and editability results on the standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.
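
The abstract says contrastive learning is used to align W with the image space, but this record gives no formula. As a rough illustration only, the sketch below implements a generic symmetric InfoNCE-style loss between paired image embeddings and latent-code embeddings. The function names, the temperature value of 0.07, and the use of cosine similarity are assumptions made for this example and are not taken from the paper, whose actual objective may differ.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy over rows, where the target of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_alignment_loss(image_embs, latent_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss (hypothetical helper, not the paper's code).

    image_embs[i] and latent_embs[i] are treated as a matching pair;
    every other combination in the batch is treated as a negative.
    """
    images = [l2_normalize(v) for v in image_embs]
    latents = [l2_normalize(v) for v in latent_embs]
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, lat)) / temperature for lat in latents]
        for img in images
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-latent and latent-to-image directions.
    return 0.5 * (
        diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_transposed)
    )
```

With this kind of objective, a batch whose image and latent embeddings already agree pair by pair yields a loss near zero, while a batch whose pairs are swapped yields a large loss, which is the pressure that pulls matching image and latent representations together.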

Citation:

Pages: 10072-10082

Page count: 11
