Controllable Mind Visual Diffusion Model

Cited by: 0
Authors
Zeng, Bohan [1]
Li, Shanglin [1]
Liu, Xuhui [1]
Gao, Sicheng [1]
Jiang, Xiaolong [3]
Tang, Xu [3]
Hu, Yao [3]
Liu, Jianzhuang [4]
Zhang, Baochang [1, 2, 5]
Affiliations
[1] Beihang Univ, Hangzhou Res Inst, Inst Artificial Intelligence, Hangzhou, Peoples R China
[2] Nanchang Inst Technol, Nanchang, Jiangxi, Peoples R China
[3] Xiaohongshu Inc, Shanghai, Peoples R China
[4] Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[5] Zhongguancun Lab, Beijing, Peoples R China
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7 | 2024
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
BRAIN; RECONSTRUCTION; IMAGES;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models. Diffusion-based methods have recently shown promise in analyzing functional magnetic resonance imaging (fMRI) data, including the reconstruction of high-quality images consistent with the original visual stimuli. Nonetheless, effectively harnessing the semantic and silhouette information extracted from brain signals remains a critical challenge. In this paper, we propose a novel approach, termed the Controllable Mind Visual Diffusion Model (CMVDM). Specifically, CMVDM first extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks. Then, a control model is introduced in conjunction with a residual block to fully exploit the extracted information for image synthesis, generating high-quality images that closely resemble the original visual stimuli in both semantic content and silhouette characteristics. Through extensive experimentation, we demonstrate that CMVDM outperforms existing state-of-the-art methods both qualitatively and quantitatively. Our code is available at https://github.com/zengbohan0217/CMVDM.
Pages: 6935-6943
Page count: 9