MLEC-QA: A Chinese Multi-Choice Biomedical Question Answering Dataset

Cited by: 0
Authors
Li, Jing [1 ]
Zhong, Shangping [1 ]
Chen, Kaizhi [1 ]
Affiliations
[1] Fuzhou Univ, Coll Comp & Data Sci, Fuzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
SYSTEM;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Question Answering (QA) has been successfully applied in human-computer interaction scenarios such as chatbots and search engines. However, in the biomedical domain, QA systems remain immature because expert-annotated datasets are limited in category and scale. In this paper, we present MLEC-QA, the largest-scale Chinese multi-choice biomedical QA dataset, collected from the National Medical Licensing Examination in China. The dataset is composed of five subsets with 136,236 biomedical multi-choice questions with extra materials (images or tables) annotated by human experts, and is the first to cover the following biomedical sub-fields: Clinic, Stomatology, Public Health, Traditional Chinese Medicine, and Traditional Chinese Medicine Combined with Western Medicine. We implement eight representative control methods and open-domain QA methods as baselines. Experimental results demonstrate that even the current best model achieves accuracies of only 40% to 55% on the five subsets, and performs especially poorly on questions that require sophisticated reasoning ability. We hope the release of the MLEC-QA dataset can serve as a valuable resource for research and evaluation in open-domain QA, and also advance biomedical QA systems.
Pages: 8862-8874
Page count: 13