Controllable image generation based on causal representation learning

Cited by: 2
Authors
Huang, Shanshan [1]
Wang, Yuanhao [1]
Gong, Zhili [1]
Liao, Jun [1]
Wang, Shu [2]
Liu, Li [1]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 401331, Peoples R China
[2] Southwest Univ, Sch Mat & Energy, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image generation; Controllable image editing; Causal structure learning; Causal representation learning; MODEL
DOI
10.1631/FITEE.2300303
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Artificial intelligence generated content (AIGC) has emerged as an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production. However, interpretability and controllability remain challenges. Existing AI methods often struggle to produce images that are both flexible and controllable while respecting the causal relationships within the images. To address this issue, we have developed a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach enables humans to control image attributes while considering the rationality and interpretability of the generated images, and also allows for the generation of counterfactual images. The key to our approach, CCIG, lies in using a causal structure learning module to learn the causal relationships between image attributes, and in optimizing this module jointly with the encoder, generator, and joint discriminator of the image generation module. By doing so, we can learn causal representations in the image's latent space and use causal intervention operations to control image generation. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results illustrate the effectiveness of CCIG.
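The causal intervention step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the form of the structural causal model, so the linear model below is an assumption, as are the function name `propagate`, the three-attribute chain, and all edge weights and noise values. It is not the authors' implementation, only a toy instance of a do-operation on latent attributes.

```python
import numpy as np


def propagate(adjacency, noise, interventions=None):
    """Evaluate a linear SCM z_j = sum_i adjacency[i, j] * z_i + noise[j].

    adjacency[i, j] != 0 means attribute i is a direct cause of attribute j.
    The graph is assumed to be acyclic, so len(noise) passes are enough
    for values to reach every descendant.
    interventions maps an attribute index to a clamped value, i.e. do(z_i = v).
    """
    interventions = interventions or {}
    z = np.zeros(len(noise))
    for _ in range(len(noise)):
        z = adjacency.T @ z + noise
        # Clamp intervened attributes, cutting them off from their causes.
        for index, value in interventions.items():
            z[index] = value
    return z


# Hypothetical chain of three latent attributes: 0 -> 1 -> 2.
adjacency = np.zeros((3, 3))
adjacency[0, 1] = 0.8
adjacency[1, 2] = 0.5
noise = np.array([1.0, 0.5, 0.2])

factual = propagate(adjacency, noise)
counterfactual = propagate(adjacency, noise, interventions={1: 2.0})

print("factual:       ", factual)
print("counterfactual:", counterfactual)
```

In this toy setting, intervening on attribute 1 changes its descendant (attribute 2) but leaves its cause (attribute 0) untouched. In a bi-directional GAN setup, the intervened latent vector would then be passed to the generator to produce the counterfactual image. That step is omitted here.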
Pages: 135-148
Number of pages: 14