Decoupled contrastive learning for multilingual multimodal medical pre-trained model

Cited: 0
Authors
Li, Qiyuan [1 ,2 ,3 ,4 ]
Qiu, Chen [1 ,2 ,3 ,4 ]
Liu, Haijiang [1 ,2 ,3 ,4 ]
Gu, Jinguang [1 ,2 ,3 ,4 ]
Luo, Dan [5 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430065, Hubei, Peoples R China
[2] Hubei Prov Key Lab Intelligent Informat Proc & Rea, Wuhan 430065, Hubei, Peoples R China
[3] Inst Sci & Tech Informat China, Key Lab Rich Media Knowledge Org, Beijing 100038, Peoples R China
[4] Inst Sci & Tech Informat China, Serv Digital Publishing Content, Beijing 100038, Peoples R China
[5] Lehigh Univ, Dept Comp Sci & Engn, Bethlehem, PA 18015 USA
Keywords
Multilingual multimodal learning; Decoupled contrastive learning; Medical pre-training model
DOI
10.1016/j.neucom.2025.129809
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multilingual multimodal pre-training aims to integrate conceptual representations across diverse languages and modalities within a shared, high-dimensional semantic space. In healthcare, this endeavor faces challenges of language diversity, suboptimal multimodal interactions, and the absence of coherent multilingual multimodal representations. To address these challenges, we introduce a novel multilingual multimodal medical pre-training model. First, we augment the medical corpus by expanding the MIMIC-CXR report dataset into 20 distinct languages using machine translation. Second, we develop a targeted label disambiguation technique to address labeling noise within decoupled contrastive learning: it categorizes and refines uncertain phrases in the clinical reports by disease type, promoting finer-grained semantic similarity and improving inter-modality interactions. Building on these components, we present a refined multilingual multimodal medical pre-trained model that significantly enhances the understanding of medical multimodal data and adapts to multilingual medical contexts. Experiments show that our model outperforms baselines on medical image classification and multilingual medical image-text retrieval by up to 13.78% and 12.6%, respectively.
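For readers unfamiliar with the decoupling step the abstract builds on, the sketch below illustrates the generic decoupled variant of the InfoNCE contrastive loss, in which the positive pair is excluded from the denominator so that the positive and negative terms are optimized independently. This is a minimal illustration of the general technique, not the paper's actual implementation; the function name, batch layout, and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def decoupled_contrastive_loss(img_emb: torch.Tensor,
                               txt_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """Symmetric decoupled contrastive loss for paired image/report embeddings.

    Unlike standard InfoNCE, the positive pair is excluded from the
    denominator, so each direction reduces to -s_pos + logsumexp(s_negatives).
    img_emb, txt_emb: (batch, dim); row i of each forms a positive pair.
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature                   # (B, B) scaled cosine sims
    pos = logits.diag()                                    # positive-pair logits

    # Diagonal mask: True on positives, which are dropped from the denominator.
    diag = torch.eye(logits.size(0), dtype=torch.bool, device=logits.device)
    neg_i2t = logits.masked_fill(diag, float('-inf'))      # image -> text negatives
    neg_t2i = logits.t().masked_fill(diag, float('-inf'))  # text -> image negatives

    loss_i2t = (-pos + torch.logsumexp(neg_i2t, dim=1)).mean()
    loss_t2i = (-pos + torch.logsumexp(neg_t2i, dim=1)).mean()
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage (hypothetical shapes): a batch of 8 image-report pairs.
img = torch.randn(8, 256)
txt = torch.randn(8, 256)
print(decoupled_contrastive_loss(img, txt))
```

One motivation commonly cited for this decoupling is that removing the positive term from the denominator avoids coupling between positive and negative gradients, which otherwise weakens the learning signal at small batch sizes, a relevant constraint for medical datasets with limited paired data.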
Pages: 17
Related Papers
50 in total (10 shown)
  • [1] Wang, Jialu; Liu, Yang; Wang, Xin Eric. Assessing Multilingual Fairness in Pre-trained Multimodal Representations. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 2681-2695.
  • [2] An, Jieyu; Zainon, Wan Mohd Nazmee Wan; Ding, Binfen. Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis. Intelligent Automation and Soft Computing, 2023, 37(2): 1673-1689.
  • [3] Zhang, Shuai; Wang, Lijie; Xiao, Xinyan; Wu, Hua. Syntax-guided Contrastive Learning for Pre-trained Language Model. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 2430-2440.
  • [4] Liu, Y.; Cui, B.; Cao, L.; Cheng, L. Clinical diagnosis normalization based on contrastive learning and pre-trained model. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, 52(5): 23-28.
  • [5] Tan, Yue; Long, Guodong; Ma, Jie; Liu, Lu; Zhou, Tianyi; Jiang, Jing. Federated Learning from Pre-Trained Models: A Contrastive Learning Approach. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [6] Zhang, Rui; Cheng, Dawei; Yang, Jie; Ouyang, Yi; Wu, Xian; Zheng, Yefeng; Jiang, Changjun. Pre-trained Online Contrastive Learning for Insurance Fraud Detection. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 20, 2024: 22511-22519.
  • [7] Zhang, Tianxing; Wu, Hanzhou; Lu, Xiaofeng; Han, Gengle; Sun, Guangling. AWEncoder: Adversarial Watermarking Pre-Trained Encoders in Contrastive Learning. Applied Sciences-Basel, 2023, 13(6).
  • [8] Lin, Zhenxi; Ma, Qianli; Yan, Jiangyue; Chen, Jieyu. CATE: A Contrastive Pre-trained Model for Metaphor Detection with Semi-supervised Learning. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3888-3898.
  • [9] Chen, Jian; Gao, Yuan; Liu, Gaoyang; Abdelmoniem, Ahmed M.; Wang, Chen. Manipulating Pre-Trained Encoder for Targeted Poisoning Attacks in Contrastive Learning. IEEE Transactions on Information Forensics and Security, 2024, 19: 2412-2424.
  • [10] Zhang, Chuan; Li, Zhuopeng; Liang, Haotian; Liang, Jinwen; Liu, Ximeng; Zhu, Liehuang. IPES: Improved Pre-trained Encoder Stealing Attack in Contrastive Learning. 2023 IEEE International Conferences on Internet of Things (iThings), IEEE Green Computing and Communications (GreenCom), IEEE Cyber, Physical and Social Computing (CPSCom), IEEE Smart Data (SmartData), and IEEE Congress on Cybermatics, 2024: 354-361.