Exploiting the performance of dictionary-based bio-entity name recognition in biomedical literature

Cited by: 47
Authors
Yang, Zhihao [1]
Lin, Hongfei [1]
Li, Yanpeng [1]
Affiliations
[1] Dalian Univ Technol, Dept Comp Sci & Engn, Dalian 116023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
text mining; entity recognition; edit distance; conditional random fields
DOI
10.1016/j.compbiolchem.2008.03.008
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Bio-entity name recognition is a key step in extracting information from biomedical literature. This paper presents a dictionary-based approach to bio-entity name recognition. The approach expands the bio-entity name dictionary using an abbreviation-definition identification algorithm and improves recall with an improved edit distance algorithm. It then applies several post-processing methods to further improve performance: pre-keyword and post-keyword expansion, part-of-speech expansion, merging of adjacent bio-entity names, and exploitation of contextual cues. Experimental results show that, with this approach, even an internal dictionary-based system can achieve fairly good performance. (C) 2008 Elsevier Ltd. All rights reserved.
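As a rough illustration of how edit distance can raise recall in dictionary lookup, the sketch below matches a candidate string against dictionary entries using the standard Levenshtein distance. It is not the paper's improved algorithm, whose details the abstract does not give. The function names, the lowercase normalization, and the `max_ratio` threshold are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def match_dictionary(candidate, dictionary, max_ratio=0.2):
    """Return (entry, distance) pairs whose length-normalized distance
    to the candidate is at most max_ratio (a hypothetical threshold)."""
    cand = candidate.lower()
    hits = []
    for entry in dictionary:
        ent = entry.lower()
        dist = edit_distance(cand, ent)
        if dist / max(len(cand), len(ent), 1) <= max_ratio:
            hits.append((entry, dist))
    return sorted(hits, key=lambda h: h[1])


# A spelling variant still matches its dictionary entry.
print(match_dictionary("NF-kappa B", ["NF-kappaB", "interleukin-2"]))
```

Normalizing by the longer string's length keeps short names from matching too loosely. An exact-match lookup would miss the variant above, which is the recall gap approximate matching is meant to close.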
Pages: 287-291
Page count: 5