Optimized biomedical entity relation extraction method with data augmentation and classification using GPT-4 and Gemini

Authors
Phan, Cong-Phuoc [1]
Phan, Ben [1]
Chiang, Jung-Hsien [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, 1 Univ Rd, Tainan 701, Taiwan
DOI
10.1093/database/baae104
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Despite numerous efforts by teams participating in BioCreative VIII Track 01, which employed a variety of techniques to achieve high accuracy on biomedical relation tasks, overall performance in this area still has substantial room for improvement. Large language models offer a new opportunity to improve the performance of existing techniques in natural language processing tasks. This paper presents our improved method for relation extraction, which integrates two well-known large language models: Gemini and GPT-4. Our approach uses GPT-4 to generate augmented training data, followed by an ensemble learning technique that combines the outputs of diverse models to produce a more precise prediction. We then use Gemini responses as input to fine-tune the BioNLP-PubMed-Bert classification model, which leads to improved performance as measured by precision, recall, and F1 scores on the same test dataset used in the challenge evaluation.
Database URL: https://biocreative.bioinformatics.udel.edu/tasks/biocreative-viii/track-1/
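The abstract does not say which ensemble scheme combines the model outputs or exactly how the scores are computed. As a rough illustration only, the sketch below assumes simple majority voting over per-model relation labels and micro-averaged precision, recall, and F1 over non-negative labels. The function names, the label strings, and the `NO_RELATION` marker are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical marker for "no relation between the entity pair".
NO_RELATION = "None"


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label lists by majority vote.

    Ties are broken in favour of the model listed first. This is an
    assumed scheme, not necessarily the one used in the paper.
    """
    if not predictions:
        return []
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must label the same examples")
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in model order, that reaches the highest count.
        combined.append(next(lab for lab in labels if counts[lab] == top))
    return combined


def precision_recall_f1(
    gold: Sequence[str], pred: Sequence[str], negative: str = NO_RELATION
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over non-negative labels."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != negative and p == g)
    pred_pos = sum(1 for p in pred if p != negative)
    gold_pos = sum(1 for g in gold if g != negative)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative labels from three hypothetical models on three entity pairs.
model_a = ["Association", "None", "Positive_Correlation"]
model_b = ["Association", "Association", "None"]
model_c = ["None", "None", "Positive_Correlation"]
gold = ["Association", "Association", "Positive_Correlation"]

ensemble = majority_vote([model_a, model_b, model_c])
scores = precision_recall_f1(gold, ensemble)
```

Majority voting is only one of several possible choices; weighted voting or averaging class probabilities would fit the same description in the abstract.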
Pages: 8