Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint

Cited by: 15
Authors
Liu, Hongyu [1 ]
Song, Yibing [2 ]
Chen, Qifeng [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Fudan Univ, AI Inst, Shanghai, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.00971
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
GAN inversion and editing via StyleGAN maps an input image into the embedding spaces (W, W+, and F) to simultaneously maintain image fidelity and enable meaningful manipulation. From the latent space W to the extended latent space W+ to the feature space F in StyleGAN, the editability of GAN inversion decreases while its reconstruction quality increases. Recent GAN inversion methods typically explore W+ and F rather than W to improve reconstruction fidelity while maintaining editability. As W+ and F are derived from W, which is essentially the foundation latent space of StyleGAN, these methods focusing on the W+ and F spaces could be improved by stepping back to W. In this work, we propose to first obtain the proper latent code in the foundation latent space W. We introduce contrastive learning to align W and the image space for proper latent code discovery. Then, we leverage a cross-attention encoder to transform the obtained latent code in W into W+ and F, accordingly. Our experiments show that our exploration of the foundation latent space W improves the representation ability of latent codes in W+ and features in F, which yields state-of-the-art reconstruction fidelity and editability results on the standard benchmarks. Project page: https://kumapowerliu.github.io/CLCAE.
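The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common way to align two embedding spaces: a symmetric InfoNCE-style loss over a batch of paired image embeddings and latent codes. The function names, the temperature value of 0.07, and the toy two-dimensional vectors are all assumptions made for illustration; this is not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_alignment_loss(image_embs, latent_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss; pair i of each list is the positive pair.

    Hypothetical helper: the temperature and the symmetric form are assumptions.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    lats = [l2_normalize(v) for v in latent_embs]
    # Cosine similarity between every image and every latent code, scaled.
    logits = [
        [sum(a * b for a, b in zip(img, lat)) / temperature for lat in lats]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # latent-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy batch: matched pairs point in the same direction, shuffled pairs do not.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_alignment_loss(images, [[2.0, 0.0], [0.0, 3.0]])
shuffled = contrastive_alignment_loss(images, [[0.0, 3.0], [2.0, 0.0]])
print(aligned < shuffled)
```

In this toy batch the loss is close to zero when each latent code points in the same direction as its paired image embedding, and large when the pairs are swapped, which is the behavior an alignment objective of this kind is meant to reward.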
Pages: 10072-10082
Page count: 11