Decoupled contrastive learning for multilingual multimodal medical pre-trained model

Cited: 0
Authors
Li, Qiyuan [1 ,2 ,3 ,4 ]
Qiu, Chen [1 ,2 ,3 ,4 ]
Liu, Haijiang [1 ,2 ,3 ,4 ]
Gu, Jinguang [1 ,2 ,3 ,4 ]
Luo, Dan [5 ]
Affiliations
[1] Wuhan Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430065, Hubei, Peoples R China
[2] Hubei Prov Key Lab Intelligent Informat Proc & Rea, Wuhan 430065, Hubei, Peoples R China
[3] Inst Sci & Tech Informat China, Key Lab Rich Media Knowledge Org, Beijing 100038, Peoples R China
[4] Inst Sci & Tech Informat China, Serv Digital Publishing Content, Beijing 100038, Peoples R China
[5] Lehigh Univ, Dept Comp Sci & Engn, Bethlehem, PA 18015 USA
Keywords
Multilingual multimodal learning; Decoupled contrastive learning; Medical pre-training model
DOI
10.1016/j.neucom.2025.129809
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Multilingual multimodal pre-training aims to integrate conceptual representations across diverse languages and modalities within a shared, high-dimensional semantic space. In healthcare, this endeavor faces challenges of language diversity, suboptimal multimodal interactions, and the absence of coherent multilingual multimodal representations. To address these challenges, we introduce a novel multilingual multimodal medical pre-training model. First, we augment the medical corpus by expanding the MIMIC-CXR report dataset into 20 distinct languages using machine translation. Second, we develop a targeted label disambiguation technique to address labeling noise within decoupled contrastive learning: it categorizes and refines uncertain phrases in the clinical reports by disease type, promoting finer-grained semantic similarity and improving inter-modality interactions. Building on these components, we present a refined multilingual multimodal medical pre-trained model that significantly enhances the understanding of medical multimodal data and adapts to multilingual medical contexts. Experiments show that our model outperforms baselines on medical image classification and multilingual medical image-text retrieval by up to 13.78% and 12.6%, respectively.
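For readers unfamiliar with the decoupling step the abstract builds on, the sketch below illustrates the generic decoupled variant of the InfoNCE contrastive loss, in which the positive pair is excluded from the denominator so that the positive and negative terms are optimized independently. This is a minimal illustration of the general technique, not the paper's actual implementation; the function name, batch layout, and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def decoupled_contrastive_loss(img_emb: torch.Tensor,
                               txt_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """Symmetric decoupled contrastive loss for paired image/report embeddings.

    Unlike standard InfoNCE, the positive pair is excluded from the
    denominator, so each direction reduces to -s_pos + logsumexp(s_negatives).
    img_emb, txt_emb: (batch, dim); row i of each forms a positive pair.
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature                   # (B, B) scaled cosine sims
    pos = logits.diag()                                    # positive-pair logits

    # Diagonal mask: True on positives, which are dropped from the denominator.
    diag = torch.eye(logits.size(0), dtype=torch.bool, device=logits.device)
    neg_i2t = logits.masked_fill(diag, float('-inf'))      # image -> text negatives
    neg_t2i = logits.t().masked_fill(diag, float('-inf'))  # text -> image negatives

    loss_i2t = (-pos + torch.logsumexp(neg_i2t, dim=1)).mean()
    loss_t2i = (-pos + torch.logsumexp(neg_t2i, dim=1)).mean()
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage (hypothetical shapes): a batch of 8 image-report pairs.
img = torch.randn(8, 256)
txt = torch.randn(8, 256)
print(decoupled_contrastive_loss(img, txt))
```

One motivation commonly cited for this decoupling is that removing the positive term from the denominator avoids coupling between positive and negative gradients, which otherwise weakens the learning signal at small batch sizes, a relevant constraint for medical datasets with limited paired data.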
Pages: 17
Related Papers
50 in total (10 shown)
  • [1] Wang, Jialu; Liu, Yang; Wang, Xin Eric. Assessing Multilingual Fairness in Pre-trained Multimodal Representations. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 2681-2695.
  • [2] An, Jieyu; Zainon, Wan Mohd Nazmee Wan; Ding, Binfen. Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis. Intelligent Automation and Soft Computing, 2023, 37(2): 1673-1689.
  • [3] Zhang, Shuai; Wang, Lijie; Xiao, Xinyan; Wu, Hua. Syntax-guided Contrastive Learning for Pre-trained Language Model. Findings of the Association for Computational Linguistics (ACL 2022), 2022: 2430-2440.
  • [4] Liu, Y.; Cui, B.; Cao, L.; Cheng, L. Clinical diagnosis normalization based on contrastive learning and pre-trained model. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, 52(5): 23-28.
  • [5] Tan, Yue; Long, Guodong; Ma, Jie; Liu, Lu; Zhou, Tianyi; Jiang, Jing. Federated Learning from Pre-Trained Models: A Contrastive Learning Approach. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [6] Zhang, Rui; Cheng, Dawei; Yang, Jie; Ouyang, Yi; Wu, Xian; Zheng, Yefeng; Jiang, Changjun. Pre-trained Online Contrastive Learning for Insurance Fraud Detection. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 20, 2024: 22511-22519.
  • [7] Zhang, Tianxing; Wu, Hanzhou; Lu, Xiaofeng; Han, Gengle; Sun, Guangling. AWEncoder: Adversarial Watermarking Pre-Trained Encoders in Contrastive Learning. Applied Sciences-Basel, 2023, 13(6).
  • [8] Lin, Zhenxi; Ma, Qianli; Yan, Jiangyue; Chen, Jieyu. CATE: A Contrastive Pre-trained Model for Metaphor Detection with Semi-supervised Learning. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3888-3898.
  • [9] Chen, Jian; Gao, Yuan; Liu, Gaoyang; Abdelmoniem, Ahmed M.; Wang, Chen. Manipulating Pre-Trained Encoder for Targeted Poisoning Attacks in Contrastive Learning. IEEE Transactions on Information Forensics and Security, 2024, 19: 2412-2424.
  • [10] Zhang, Chuan; Li, Zhuopeng; Liang, Haotian; Liang, Jinwen; Liu, Ximeng; Zhu, Liehuang. IPES: Improved Pre-trained Encoder Stealing Attack in Contrastive Learning. 2023 IEEE International Conferences on Internet of Things (iThings), IEEE Green Computing and Communications (GreenCom), IEEE Cyber, Physical and Social Computing (CPSCom), IEEE Smart Data (SmartData), and IEEE Congress on Cybermatics, 2024: 354-361.