MSAFusionNet: Multiple Subspace Attention Based Deep Multi-modal Fusion Network

Cited by: 4
Authors
Zhang, Sen [1 ]
Zhang, Changzheng [1 ]
Wang, Lanjun [2 ]
Li, Cixing [1 ]
Tu, Dandan [1 ]
Luo, Rui [3 ]
Qi, Guojun [3 ]
Luo, Jiebo [4 ]
Affiliations
[1] Huawei, Shenzhen, Peoples R China
[2] Huawei Canada, Markham, ON, Canada
[3] Futurewei, Bellevue, WA USA
[4] Univ Rochester, Rochester, NY 14627 USA
Keywords
Deep learning; Multi-modal learning; Segmentation
DOI
10.1007/978-3-030-32692-0_7
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
It is common for doctors to consider multi-modal information simultaneously when making a diagnosis. However, how to use multi-modal medical images effectively has not been fully studied in the field of deep learning within such a context. In this paper, we address the task of end-to-end segmentation based on multi-modal data and propose a novel deep learning framework, the multiple subspace attention-based deep multi-modal fusion network (hereafter referred to as MSAFusionNet). More specifically, MSAFusionNet consists of three main components: (1) a multiple subspace attention model that contains inter-attention modules and generalized squeeze-and-excitation modules, (2) a multi-modal fusion network that leverages CNN-LSTM layers to integrate sequential multi-modal input images, and (3) a densely-dilated U-Net as the encoder-decoder backbone for image segmentation. Experiments on the ISLES 2018 dataset show that MSAFusionNet achieves state-of-the-art segmentation accuracy.
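The abstract names the building blocks but gives no implementation details. As a minimal illustration of the channel-attention idea behind the generalized squeeze-and-excitation modules in component (1), the PyTorch sketch below implements a standard squeeze-and-excitation block applied to a multi-channel feature map. The class name SEBlock, the reduction ratio, and the tensor shapes are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch (not the authors' code): a standard squeeze-and-excitation
# (SE) channel-attention block of the kind the abstract generalizes.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel attention: squeeze spatial dims, excite per-channel weights."""
    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global average pool
        self.fc = nn.Sequential(                     # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                  # reweight channels

# Usage: reweight a 4-channel (e.g. 4-modality) feature map.
feats = torch.randn(2, 4, 64, 64)
print(SEBlock(channels=4)(feats).shape)  # torch.Size([2, 4, 64, 64])
```

In the paper's setting, such a block would reweight per-channel (or per-modality) feature maps before fusion; the actual generalized variant and the inter-attention modules are described in the paper itself.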
Pages: 54-62
Page count: 9
Related Papers
50 records in total
  • [11] Pedestrian Facial Attention Detection Using Deep Fusion and Multi-Modal Fusion Classifier
    Lian, Jing
    Wang, Zhenghao
    Yang, Dongfang
    Zheng, Wen
    Li, Linhui
    Zhang, Yibin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 967 - 980
  • [13] Deep Convolutional Neural Network for Multi-Modal Image Restoration and Fusion
    Deng, Xin
    Dragotti, Pier Luigi
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (10) : 3333 - 3348
  • [14] Deep unsupervised multi-modal fusion network for detecting driver distraction
    Zhang, Yuxin
    Chen, Yiqiang
    Gao, Chenlong
    NEUROCOMPUTING, 2021, 421 : 26 - 38
  • [16] CMAF-Net: a cross-modal attention fusion-based deep neural network for incomplete multi-modal brain tumor segmentation
    Sun, Kangkang
    Ding, Jiangyi
    Li, Qixuan
    Chen, Wei
    Zhang, Heng
    Sun, Jiawei
    Jiao, Zhuqing
    Ni, Xinye
    QUANTITATIVE IMAGING IN MEDICINE AND SURGERY, 2024, 14 (07) : 4579 - 4604
  • [17] Multi-Modal Electrophysiological Source Imaging With Attention Neural Networks Based on Deep Fusion of EEG and MEG
    Jiao, Meng
    Yang, Shihao
    Xian, Xiaochen
    Fotedar, Neel
    Liu, Feng
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2024, 32 : 2492 - 2502
  • [18] ATTENTION DRIVEN FUSION FOR MULTI-MODAL EMOTION RECOGNITION
    Priyasad, Darshana
    Fernando, Tharindu
    Denman, Simon
    Sridharan, Sridha
    Fookes, Clinton
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 3227 - 3231
  • [19] Mixture of Attention Variants for Modal Fusion in Multi-Modal Sentiment Analysis
    He, Chao
    Zhang, Xinghua
    Song, Dongqing
    Shen, Yingshan
    Mao, Chengjie
    Wen, Huosheng
    Zhu, Dingju
    Cai, Lihua
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (02)
  • [20] Multi-Modal Image Fusion via Deep Laplacian Pyramid Hybrid Network
    Luo, Xing
    Fu, Guizhong
    Yang, Jiangxin
    Cao, Yanlong
    Cao, Yanpeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7354 - 7369