Multi-grained Cross-Modal Feature Fusion Network for Diagnosis Prediction

Cited: 0
Authors
An, Ying [1 ]
Zhao, Zhenrui [2 ]
Chen, Xianlai [1 ]
Affiliations
[1] Cent South Univ, Big Data Inst, Changsha, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Electronic Health Records; Multimodal Fusion; Diagnosis Prediction;
DOI
10.1007/978-981-97-5131-0_19
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Electronic Health Records (EHRs) contain a wealth of data from multiple modalities. Leveraging these data to comprehensively reflect changes in patients' conditions and accurately predict their diseases is an important research problem in the medical field. However, the fusion approaches employed in most existing multimodal learning studies are overly simplistic and often neglect the hierarchical nature of intermodal interactions. In this paper, we propose a novel multi-grained cross-modal feature fusion network. In this model, we first use hierarchical encoders to learn multilevel representations of multimodal data, along with a specially designed attention mechanism that explores hierarchical relationships within a single modality. Afterward, we construct a fine-grained cross-modal clinical semantic relationship graph between code and sentence representations, and we apply Graph Convolutional Networks (GCNs) on this graph to achieve fine-grained feature fusion. Finally, we use attention mechanisms to fully learn the contextual interactions between visit-level multimodal representations, realizing coarse-grained feature fusion. We evaluate our model on two real-world clinical datasets, and the experimental results validate its effectiveness.
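The fine-grained fusion step described in the abstract — a cross-modal graph between medical-code and note-sentence representations, followed by graph convolution — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy node features, the specific edges, and the single GCN layer with symmetric normalization are all assumptions made for clarity.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Hypothetical toy graph: 3 medical-code nodes + 2 note-sentence nodes
rng = np.random.default_rng(0)
d_model = 8
H = rng.standard_normal((5, d_model))       # node features (codes then sentences)
A = np.zeros((5, 5))
# undirected cross-modal edges between codes and semantically related sentences
for code, sent in [(0, 3), (1, 3), (2, 4)]:
    A[code, sent] = A[sent, code] = 1.0
W = rng.standard_normal((d_model, d_model)) * 0.1

H_fused = gcn_layer(A, H, W)                # fused fine-grained representations
print(H_fused.shape)                        # (5, 8)
```

After the convolution, each code node's representation mixes in features from the sentences it is linked to (and vice versa), which is the sense in which the graph realizes fine-grained cross-modal fusion.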
Pages: 221-232
Page count: 12
Related Papers
50 records in total
  • [31] Cross-Modal Fine-Grained Interaction Fusion in Fake News Detection
    Che, Zhanbin
    Cui, GuangBo
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (05) : 945 - 956
  • [32] Estimation of Pig Weight Based on Cross-modal Feature Fusion Model
    He W.
    Mi Y.
    Liu G.
    Ding X.
    Li T.
     Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2023, 54 : 275 - 282, 329
  • [33] Cross-Modal Hybrid Feature Fusion for Image-Sentence Matching
    Xu, Xing
    Wang, Yifan
    He, Yixuan
    Yang, Yang
    Hanjalic, Alan
    Shen, Heng Tao
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (04)
  • [34] Cross-modal misalignment-robust feature fusion for crowd counting
    Kong, Weihang
    Yu, Zepeng
    Li, He
    Zhang, Junge
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 136
  • [35] Heterogeneous Feature Fusion and Cross-modal Alignment for Composed Image Retrieval
    Zhang, Gangjian
    Wei, Shikui
    Pang, Huaxin
    Zhao, Yao
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 5353 - 5362
  • [36] A Cross-Modal Correlation Fusion Network for Emotion Recognition in Conversations
    Tang, Xiaolyu
    Cai, Guoyong
    Chen, Ming
    Yuan, Peicong
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT V, NLPCC 2024, 2025, 15363 : 55 - 68
  • [37] Triplet Fusion Network Hashing for Unpaired Cross-Modal Retrieval
    Hu, Zhikai
    Liu, Xin
    Wang, Xingzhi
    Cheung, Yiu-ming
    Wang, Nannan
    Chen, Yewang
    ICMR'19: PROCEEDINGS OF THE 2019 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2019, : 141 - 149
  • [38] Cross-modal evidential fusion network for social media classification
    Yu, Chen
    Wang, Zhiguo
    COMPUTER SPEECH AND LANGUAGE, 2025, 92
  • [39] Bilateral Cross-Modal Fusion Network for Robot Grasp Detection
    Zhang, Qiang
    Sun, Xueying
    SENSORS, 2023, 23 (06)
  • [40] Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation
    Zhang, Pan
    Chen, Ming
    Gao, Meng
    SENSORS, 2024, 24 (08)