Fine-tuning large language models for improved health communication in low-resource languages

Cited: 0
Authors
Bui, Nhat [1 ]
Nguyen, Giang [1 ]
Nguyen, Nguyen [1 ]
Vo, Bao [1 ]
Vo, Luan [1 ]
Huynh, Tom [1 ]
Tang, Arthur [1 ]
Tran, Van Nhiem [2 ]
Huynh, Tuyen [3 ]
Nguyen, Huy Quang [3 ]
Dinh, Minh [1 ]
Affiliations
[1] RMIT Univ, Sch Sci Engn & Technol, Ho Chi Minh City, Vietnam
[2] Hon Hai Res Inst, AI Res Ctr, Taipei 114699, Taiwan
[3] Oxford Univ Clin Res Unit OUCRU, Ho Chi Minh City, Vietnam
Keywords
Artificial intelligence; Large language model; Low-resource languages; Health communication and promotion; Data privacy and security; Health equity
DOI
10.1016/j.cmpb.2025.108655
CLC number
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
Background: This study presents a methodology for compiling training datasets to fine-tune Large Language Models (LLMs) for healthcare information in Vietnamese, a low-resource language. The objective is to bridge the gap in medical information accessibility and improve healthcare communication in developing countries by adapting LLMs to specific linguistic nuances and domain needs. Method: The methodology involves selecting a base model, compiling a domain-specific dataset, and fine-tuning the model on that dataset. Three open-source models were selected. The dataset, comprising approximately 337,000 prompt-response pairs in Vietnamese, was compiled from existing datasets, data crawled from Vietnamese medical online forums, and content distilled from Vietnamese medical textbooks. The three models were fine-tuned using the Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA) techniques. Model performance was evaluated using BERTScore, ROUGE-L, and the "LLM-as-a-Judge" method. Results: The fine-tuned models outperformed their base versions on all three evaluation metrics (BERTScore, ROUGE-L, and "LLM-as-a-Judge"), confirming the effectiveness of the fine-tuning process. The study details the process of fine-tuning open-source LLMs for health information inquiries in Vietnamese, demonstrating its potential to improve healthcare communication in low-resource languages. Deploying the fine-tuned LLM on-premise enhances data privacy and security; however, the significant computing power and costs required pose challenges, especially for organizations in developing countries. Conclusion: This case study highlights the unique challenges faced by developing countries using low-resource languages. Initiatives are needed to bridge healthcare gaps in underserved areas and contribute to global health equity.
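One of the evaluation metrics named in the abstract, ROUGE-L, scores a candidate answer against a reference by the longest common subsequence (LCS) of their tokens. A minimal sketch of the F1 variant with whitespace tokenization follows; the paper's exact scoring configuration (tokenizer, aggregation) is not specified here, so this is illustrative only.

```python
def lcs_len(a, b):
    # Longest common subsequence length via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference):
    # ROUGE-L F1: harmonic mean of LCS-based precision and recall.
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the cat on mat", "the cat sat on the mat"))  # 0.8
```

Here the LCS is "the cat on mat" (4 tokens), giving precision 4/4, recall 4/6, and F1 = 0.8.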
Pages: 11