CMFA_Net: A cross-modal feature aggregation network for infrared-visible image fusion

Cited by: 15
Authors
Ding, Zhaisheng [1]
Li, Haiyan [1]
Zhou, Dongming [1]
Li, Hongsong [1]
Liu, Yanyu [1]
Hou, Ruichao [2]
Affiliations
[1] Yunnan Univ, Sch Informat, Kunming 650504, Yunnan, Peoples R China
[2] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal; Attention mechanism; Image fusion; Unsupervised learning; End-to-end network; Infrared-visible images; PERFORMANCE; FRAMEWORK;
DOI
10.1016/j.infrared.2021.103905
Chinese Library Classification (CLC) number
TH7 [Instruments and Meters];
Discipline classification code
0804 ; 080401 ; 081102 ;
Abstract
Infrared and visible image fusion is a typical cross-modal information enhancement technology that aims to extract complementary cues from different sensors to reconstruct an informative image or video. Many related works focus on designing hand-crafted fusion rules and ignore the inherent complementarity between modalities, and therefore fail to fully exploit the capacity of deep models. In this work, an unsupervised cross-modal feature aggregation network (CMFA_Net) is developed, which effectively explores the latent correlations between the internal characteristics and uses this information to produce a satisfactory fused image. First, a densely integrated structure and an attention module are proposed to form a feature extractor. Next, the l1-norm and the attention mechanism are combined to fuse the affinity features of the cross-modal images. Finally, the fused image is reconstructed by a deconvolution block. To ensure that the fused image is clear and information-rich, a dedicated loss function is proposed that uses the average pixel decision for structural similarity (SSIM-p) and content-gram variation (CGV), and the model is trained on the KAIST dataset. Extensive experiments verify the effectiveness and robustness of the proposed model and show that it outperforms state-of-the-art methods in both subjective and objective evaluations while requiring less computation.
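The abstract names an l1-norm-based fusion step but gives no formula. As a rough illustration only, the sketch below shows one common way an l1-norm can serve as a per-pixel activity measure for weighting two feature maps. It is an assumption-laden stand-in, not the CMFA_Net fusion rule: the attention component is omitted, and the function name `l1_norm_fusion`, the `(C, H, W)` layout, and the `eps` parameter are all hypothetical.

```python
import numpy as np


def l1_norm_fusion(feat_ir, feat_vis, eps=1e-8):
    """Fuse two (C, H, W) feature maps with l1-norm activity weights.

    Hypothetical sketch: a generic l1-norm weighting scheme, not the
    fusion rule defined in the CMFA_Net paper.
    """
    # Per-pixel activity level: l1-norm taken across the channel axis.
    act_ir = np.abs(feat_ir).sum(axis=0)
    act_vis = np.abs(feat_vis).sum(axis=0)

    # Normalise the two activity maps into weights that sum to ~1 per pixel;
    # eps guards against division by zero where both maps are inactive.
    total = act_ir + act_vis + eps
    w_ir = act_ir / total
    w_vis = act_vis / total

    # Broadcast the (H, W) weights over the channel axis and blend.
    return w_ir[None] * feat_ir + w_vis[None] * feat_vis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.standard_normal((16, 8, 8))   # placeholder infrared features
    vis = rng.standard_normal((16, 8, 8))  # placeholder visible features
    fused = l1_norm_fusion(ir, vis)
    print(fused.shape)
```

With this weighting, the modality whose features are more active at a given pixel contributes more to the fused feature at that pixel; a learned attention map could in principle replace or modulate these weights.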
Pages: 13
Related papers
50 in total
  • [11] TCTFusion: A Triple-Branch Cross-Modal Transformer for Adaptive Infrared and Visible Image Fusion
    Zhang, Liang
    Jiang, Yueqiu
    Yang, Wei
    Liu, Bo
    ELECTRONICS, 2025, 14 (04):
  • [12] CMFuse: Cross-Modal Features Mixing via Convolution and MLP for Infrared and Visible Image Fusion
    Cai, Zhao
    Ma, Yong
    Huang, Jun
    Mei, Xiaoguang
    Fan, Fan
    Zhao, Zhiqing
    IEEE SENSORS JOURNAL, 2024, 24 (15) : 24152 - 24167
  • [13] CMFFN: An efficient cross-modal feature fusion network for semantic
    Zhang, Yingjian
    Li, Ning
    Jiao, Jichao
    Ai, Jiawen
    Yan, Zheng
    Zeng, Yingchao
    Zhang, Tianxiang
    Li, Qian
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2025, 186
  • [14] Dual-Attention-Based Feature Aggregation Network for Infrared and Visible Image Fusion
    Tang, Zhimin
    Xiao, Guobao
    Guo, Junwen
    Wang, Shiping
    Ma, Jiayi
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [15] CEFusion: An Infrared and Visible Image Fusion Network Based on Cross-Modal Multi-Granularity Information Interaction and Edge Guidance
    Yang, Bin
    Hu, Yuxuan
    Liu, Xiaowen
    Li, Jing
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 17794 - 17809
  • [16] Infrared and Visible Cross-Modal Image Retrieval Through Shared Features
    Liu, Fangcen
    Gao, Chenqiang
    Sun, Yongqing
    Zhao, Yue
    Yang, Feng
    Qin, Anyong
    Meng, Deyu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (11) : 4485 - 4496
  • [17] Infrared-Visible Image Fusion through Feature-Based Decomposition and Domain Normalization
    Chen, Weiyi
    Miao, Lingjuan
    Wang, Yuhao
    Zhou, Zhiqiang
    Qiao, Yajun
    REMOTE SENSING, 2024, 16 (06)
  • [18] Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching
    Xu, Xing
    Wang, Yifan
    He, Yixuan
    Yang, Yang
    Hanjalic, Alan
    Shen, Heng Tao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (04)
  • [19] Heterogeneous Feature Fusion and Cross-modal Alignment for Composed Image Retrieval
    Zhang, Gangjian
    Wei, Shikui
    Pang, Huaxin
    Zhao, Yao
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5353 - 5362
  • [20] A Contrastive Learning Approach for Infrared-Visible Image Fusion
    Gupta, Ashish Kumar
    Barnwal, Meghna
    Mishra, Deepak
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2023, 2023, 14301 : 199 - 208