Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Cited by: 15
Authors
Liu, Hongyu [1]
Song, Yibing [2]
Chen, Qifeng [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Fudan Univ, AI Inst, Shanghai, Peoples R China
DOI
10.1109/CVPR52729.2023.00971
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
GAN inversion and editing via StyleGAN maps an input image into the embedding spaces (W, W+, and F) so that image fidelity and meaningful manipulation are maintained simultaneously. Moving from the latent space W to the extended latent space W+ to the feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W+ and F rather than W to improve reconstruction fidelity while maintaining editability. Because W+ and F are derived from W, which is essentially the foundation latent space of StyleGAN, GAN inversion methods that focus on the W+ and F spaces could be improved by stepping back to W. In this work, we propose to first obtain a proper latent code in the foundation latent space W. We introduce contrastive learning to align W with the image space for proper latent code discovery. We then use a cross-attention encoder to transform the obtained latent code in W into W+ and F, accordingly. Our experiments show that exploring the foundation latent space W improves the representation ability of latent codes in W+ and features in F, which yields state-of-the-art reconstruction fidelity and editability results on the standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.
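The abstract mentions contrastive learning to align W with the image space but does not state the loss. As a rough illustration only, the sketch below shows a generic symmetric InfoNCE-style objective over paired image and latent embeddings. The function name `info_nce_loss`, the temperature value, and the use of NumPy are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def info_nce_loss(image_emb, latent_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Hypothetical helper for illustration; not the authors' implementation.
    image_emb, latent_emb: arrays of shape (N, D); row i of each is a pair.
    """
    # L2-normalise so the dot product is a cosine similarity.
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    lat = latent_emb / np.linalg.norm(latent_emb, axis=1, keepdims=True)

    # (N, N) similarity matrix; matching pairs sit on the diagonal.
    logits = img @ lat.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the image-to-latent and latent-to-image directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(8, 16))
    print("aligned pairs:   ", info_nce_loss(emb, emb))
    print("misaligned pairs:", info_nce_loss(emb, np.roll(emb, 1, axis=0)))
```

Minimising a loss of this kind pulls each image embedding toward its own latent code and pushes it away from the other codes in the batch, which is the general sense in which contrastive learning aligns two spaces.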
Pages: 10072-10082
Page count: 11