Controllable image generation based on causal representation learning

Cited by: 2
Authors
Huang, Shanshan [1]
Wang, Yuanhao [1]
Gong, Zhili [1]
Liao, Jun [1]
Wang, Shu [2]
Liu, Li [1]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 401331, Peoples R China
[2] Southwest Univ, Sch Mat & Energy, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image generation; Controllable image editing; Causal structure learning; Causal representation learning; MODEL;
DOI
10.1631/FITEE.2300303
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Artificial intelligence generated content (AIGC) has emerged as an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production. However, interpretability and controllability remain challenges. Existing AI methods often struggle to produce images that are both flexible and controllable while respecting the causal relationships within the images. To address this issue, we have developed a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach lets humans control image attributes while preserving the rationality and interpretability of the generated images, and it also allows for the generation of counterfactual images. The key to CCIG lies in using a causal structure learning module to learn the causal relationships between image attributes, and in optimizing this module jointly with the encoder, generator, and joint discriminator of the image generation module. By doing so, we can learn causal representations in the image's latent space and use causal intervention operations to control image generation. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results illustrate the effectiveness of CCIG.
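The idea of a causal intervention on latent attributes can be illustrated with a minimal sketch. It assumes a linear structural causal model over three hypothetical attributes, which is a simplification chosen for illustration and is not taken from the paper; the adjacency matrix, the noise values, and the function name `intervene` are all assumptions.

```python
import numpy as np

def intervene(adjacency, noise, target, value):
    """Apply do(z[target] = value) in a linear SCM z = A^T z + eps.

    adjacency[i, j] is the (assumed) weight of the edge i -> j.
    """
    a_do = adjacency.copy()
    a_do[:, target] = 0.0      # cut every edge pointing into the target
    eps_do = noise.copy()
    eps_do[target] = value     # the target is fixed to the chosen value
    n = adjacency.shape[0]
    # Solve (I - A_do^T) z = eps_do for the post-intervention latents.
    return np.linalg.solve(np.eye(n) - a_do.T, eps_do)

# Hypothetical chain of three attributes: 0 -> 1 -> 2.
A = np.array([[0.0, 0.8, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])
eps = np.array([1.0, 0.2, -0.1])

z_obs = np.linalg.solve(np.eye(3) - A.T, eps)    # observational latents
z_do = intervene(A, eps, target=1, value=2.0)    # latents under do(z1 = 2)
```

In this sketch the intervention changes attribute 1 and its descendant, attribute 2, while the upstream attribute 0 keeps its observational value. That asymmetry is what separates a causal intervention from simply conditioning on a value.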
Pages: 135-148
Page count: 14
Related papers
50 in total
  • [1] Disentangled Representation Learning for Controllable Person Image Generation
    Xu, Wenju
    Long, Chengjiang
    Nie, Yongwei
    Wang, Guanghui
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6065 - 6077
  • [2] SpaText: Spatio-Textual Representation for Controllable Image Generation
    Avrahami, Omri
    Hayes, Thomas
    Gafni, Oran
    Gupta, Sonal
    Taigman, Yaniv
    Parikh, Devi
    Lischinski, Dani
    Fried, Ohad
    Yin, Xi
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18370 - 18380
  • [3] Diffusion-Based Causal Representation Learning
    Mamaghan, Amir Mohammad Karimi
    Dittadi, Andrea
    Bauer, Stefan
    Johansson, Karl Henrik
    Quinzan, Francesco
    ENTROPY, 2024, 26 (07)
  • [4] Image Generation for Printed Character by Representation Learning
    Gu, Kangzheng
    Bai, Jiansong
    Zhang, Qichen
    Peng, Junjie
    Zhang, Wenqiang
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT III, 2018, 11166 : 651 - 660
  • [5] A Causal Lens for Controllable Text Generation
    Hu, Zhiting
    Li, Li Erran
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Causal Invariant Representation Learning Based on Style Intervention Identity Regularization for Remote Sensing Image
    Zhang, Yunsheng
    Liu, Fanfan
    Zhang, Jia
    Li, Haifeng
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2025, 22
  • [7] Disentangled Representation Learning for Controllable Image Synthesis: an Information-Theoretic Perspective
    Tang, Shichang
    Zhou, Xu
    He, Xuming
    Ma, Yi
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 10042 - 10049
  • [8] Controllable image generation and manipulation
    Patras, Ioannis
    PROCEEDINGS OF THE 2ND ACM INTERNATIONAL WORKSHOP ON MULTIMEDIA AI AGAINST DISCRIMINATION, MAD 2023, 2023, : 1 - 1
  • [9] Toward Causal Representation Learning
    Schoelkopf, Bernhard
    Locatello, Francesco
    Bauer, Stefan
    Ke, Nan Rosemary
    Kalchbrenner, Nal
    Goyal, Anirudh
    Bengio, Yoshua
    PROCEEDINGS OF THE IEEE, 2021, 109 (05) : 612 - 634
  • [10] Interventional Causal Representation Learning
    Ahuja, Kartik
    Mahajan, Divyat
    Wang, Yixin
    Bengio, Yoshua
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202 : 372 - 407