Aligning Medical Images with General Knowledge from Large Language Models

Times cited: 0
Authors
Fang, Xiao [1]
Lin, Yi [1]
Zhang, Dong [2]
Cheng, Kwang-Ting [2]
Chen, Hao [1, 3, 4]
Affiliations
[1] HKUST, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] HKUST, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[3] HKUST, Dept Chem & Biol Engn, Hong Kong, Peoples R China
[4] HKUST Shenzhen Hong Kong Collaborat Innovat Res I, Shenzhen, Peoples R China
Keywords
Prompt Learning; Vision-Language Models; Large Language Model; Medical Image Analysis
DOI
10.1007/978-3-031-72117-5_6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pre-trained large vision-language models (VLMs) like CLIP have revolutionized visual representation learning by using natural language as supervision, and have demonstrated promising generalization ability. In this work, we propose ViP, a novel visual symptom-guided prompt learning framework for medical image analysis, which facilitates general knowledge transfer from CLIP. ViP consists of two key components: a visual symptom generator (VSG) and a dual-prompt network. Specifically, VSG aims to extract explicable visual symptoms from pre-trained large language models, while the dual-prompt network uses these visual symptoms to guide the training of two learnable prompt modules, i.e., context prompt and merge prompt, which effectively adapts our framework to medical image analysis via large VLMs. Extensive experimental results demonstrate that ViP can outperform state-of-the-art methods on two challenging datasets. The code is available at https://github.com/xiaofang007/ViP.
Pages: 57-67
Page count: 11