Optimized biomedical entity relation extraction method with data augmentation and classification using GPT-4 and Gemini

Authors
Phan, Cong-Phuoc [1]
Phan, Ben [1]
Chiang, Jung-Hsien [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, 1 Univ Rd, Tainan 701, Taiwan
DOI
10.1093/database/baae104
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Despite numerous efforts by teams participating in BioCreative VIII Track 01, who employed various techniques to achieve high accuracy on biomedical relation extraction tasks, overall performance in this area still has substantial room for improvement. Large language models offer a new opportunity to improve the performance of existing techniques in natural language processing tasks. This paper presents our improved method for relation extraction, which integrates two well-known large language models: Gemini and GPT-4. Our new approach uses GPT-4 to generate augmented training data, followed by an ensemble learning technique that combines the outputs of diverse models to produce a more precise prediction. We then use Gemini responses as input to fine-tune the BioNLP-PubMed-Bert classification model, which leads to improved performance as measured by precision, recall, and F1 scores on the same test dataset used in the challenge evaluation.
Database URL: https://biocreative.bioinformatics.udel.edu/tasks/biocreative-viii/track-1/
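The abstract does not specify which ensemble technique is used or how the scores are averaged, so the following is only a minimal sketch under stated assumptions: it combines per-model relation labels by simple majority vote and then computes micro-averaged precision, recall, and F1 over non-negative labels. The function names, the label strings, and the `"None"` negative class are all hypothetical and chosen for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several models' label lists into one label per example.

    `predictions` is a list of equally long label lists, one per model.
    Majority voting is an assumed stand-in for the paper's ensemble step.
    """
    combined = []
    for labels in zip(*predictions):
        # Ties go to the label seen first, i.e. the earlier model in the list.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def micro_prf(gold, pred, negative="None"):
    """Micro-averaged precision, recall, and F1, ignoring the negative class."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != negative and p == g)
    pred_pos = sum(1 for p in pred if p != negative)
    gold_pos = sum(1 for g in gold if g != negative)
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical outputs of three classifiers on four candidate entity pairs.
model_a = ["Association", "None", "Positive_Correlation", "None"]
model_b = ["Association", "Association", "Positive_Correlation", "None"]
model_c = ["None", "Association", "Negative_Correlation", "None"]
gold = ["Association", "Association", "Positive_Correlation", "Association"]

ensemble = majority_vote([model_a, model_b, model_c])
precision, recall, f1 = micro_prf(gold, ensemble)
print(ensemble, precision, recall, round(f1, 3))
```

In this toy example the ensemble recovers three of the four gold relations without any false positives, which illustrates how voting can trade a little recall for higher precision; real behavior depends on the models and data.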
Pages: 8