A comparative study for biomedical named entity recognition

Cited by: 40
Authors
Wang, Xu [1]
Yang, Chen [2]
Guan, Renchu [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[2] Jilin Univ, Coll Earth Sci, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Biomedical named entity recognition; Machine learning; HMM; CRF; DICTIONARY; TEXT;
DOI
10.1007/s13042-015-0426-6
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With high-throughput technologies applied in biomedical research, the quantity of biomedical literature is growing exponentially. It is becoming increasingly important to extract knowledge from manuscripts quickly and accurately, especially in the era of big data. Named entity recognition (NER), which aims to identify chunks of text that refer to specific entities, is essentially the initial step of information extraction. In this paper, we review three models of biomedical NER and two well-known machine learning methods, the Hidden Markov Model (HMM) and Conditional Random Fields (CRF), which have been widely applied in bioinformatics. Based on these two methods, six excellent biomedical NER tools are compared in terms of programming language, feature sets, underlying mathematical methods, post-processing techniques and flowcharts. Experiments with these tools are conducted on two widely used corpora, GENETAG and JNLPBA. The comparison ranges from individual entity types to overall performance. Furthermore, we put forward suggestions on selecting Bio-NER tools for different applications.
Pages: 373-382
Page count: 10
Related papers
50 in total
  • [31] Towards reliable named entity recognition in the biomedical domain
    Giorgi, John M.
    Bader, Gary D.
    BIOINFORMATICS, 2020, 36 (01) : 280 - 286
  • [32] Improving biomedical named entity recognition with syntactic information
    Tian, Yuanhe
    Shen, Wang
    Song, Yan
    Xia, Fei
    He, Min
    Li, Kenli
    BMC BIOINFORMATICS, 2020, 21 (01)
  • [33] Exploring the effects of drug, disease, and protein dependencies on biomedical named entity recognition: A comparative analysis
    Han, Peifu
    Li, Xue
    Wang, Xun
    Wang, Shuang
    Gao, Changnan
    Chen, Wenqi
    FRONTIERS IN PHARMACOLOGY, 2022, 13
  • [34] Comparative study of text representation and learning for Persian named entity recognition
    Pour, Mohammad Mahdi Abdollah
    Momtazi, Saeedeh
    ETRI JOURNAL, 2022, 44 (05) : 794 - 804
  • [35] Distantly Supervised Biomedical Named Entity Recognition with Dictionary Expansion
    Wang, Xuan
    Zhang, Yu
    Li, Qi
    Ren, Xiang
    Shang, Jingbo
    Han, Jiawei
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 496 - 503
  • [36] Biomedical named entity recognition using generalized expectation criteria
    Lin Yao
    Chengjie Sun
    Yan Wu
    Xiaolong Wang
    Xuan Wang
    International Journal of Machine Learning and Cybernetics, 2011, 2 : 235 - 243
  • [37] Transfer learning for biomedical named entity recognition with neural networks
    Giorgi, John M.
    Bader, Gary D.
    BIOINFORMATICS, 2018, 34 (23) : 4087 - 4094
  • [38] Named entity recognition in Turkish: A comparative study with detailed error analysis
    Ozcelik, Oguzhan
    Toraman, Cagri
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (06)
  • [39] A comparative study of Chinese named entity recognition with different segment representations
    Pan, Jun
    Zhang, Chaohua
    Wang, Haijun
    Wu, Zongda
    APPLIED INTELLIGENCE, 2022, 52 (11) : 12457 - 12469
  • [40] A comparative study of Chinese named entity recognition with different segment representations
    Jun Pan
    Chaohua Zhang
    Haijun Wang
    Zongda Wu
    Applied Intelligence, 2022, 52 : 12457 - 12469