Improving chemical entity recognition through h-index based semantic similarity

Cited by: 12
Authors
Lamurias, Andre [1]
Ferreira, Joao D. [1]
Couto, Francisco M. [1]
Affiliations
[1] Univ Lisbon, LaSIGE, Dept Informat, Fac Ciencias, P-1749016 Lisbon, Portugal
Keywords
CHEMDNER; DRUGS
DOI
10.1186/1758-2946-7-S1-S13
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Background: Our approach to the BioCreative IV challenge on the recognition and classification of drug names (CHEMDNER task) aimed to achieve high precision by applying semantic similarity validation techniques to Chemical Entities of Biological Interest (ChEBI) mappings. Our assumption is that chemical entities mentioned in the same fragment of text should share some semantic relation. We further improved this validation method by adapting the semantic similarity measure to take into account the h-index of each ancestor. We applied this adaptation to two measures, simUI and simGIC, and validated the results obtained for the competition, comparing each adapted measure to its original version.

Results: For the competition, we trained a Random Forest classifier on various scores provided by our system, including semantic similarity, which improved the F-measure obtained with the Conditional Random Fields classifiers by 4.6%. Using a notion of concept relevance based on the h-index, we enhanced our validation process so that, for a fixed recall, precision increased because a larger number of false positives was excluded from the results. We plotted precision and recall values for a range of validation thresholds using different similarity measures, and obtained higher precision for the same recall with the h-index-based measures.

Conclusions: The semantic similarity measure we introduced was more effective at validating text mining results from machine learning classifiers than the other measures. We improved the results we obtained for the CHEMDNER task by maintaining high precision while improving recall and F-measure.
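The abstract names simUI and simGIC and says both were adapted to account for the h-index of each ancestor, but it does not give the formulas. The sketch below is one possible reading, not the authors' implementation. It assumes that simUI is the ratio of shared to combined ancestors, that simGIC is the same ratio weighted by information content, that a term's h-index is the largest h such that it has at least h children with at least h children each, and that the adaptation simply discards ancestors whose h-index falls below a threshold. The toy ontology, the function names, and the `min_h` parameter are all hypothetical.

```python
from typing import Dict, List, Set

# Toy ontology (hypothetical, not ChEBI): each term maps to its direct children.
CHILDREN: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["a1", "a2"],
    "B": ["b1"],
    "a1": ["x", "y"],
    "a2": ["z", "w"],
}

# Invert the child map to obtain the direct parents of each term.
PARENTS: Dict[str, List[str]] = {}
for parent, kids in CHILDREN.items():
    for kid in kids:
        PARENTS.setdefault(kid, []).append(parent)


def ancestors(term: str) -> Set[str]:
    """All ancestors of `term`, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for p in PARENTS.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def h_index(term: str) -> int:
    """Assumed definition: the largest h such that `term` has at least
    h children that each have at least h children."""
    counts = sorted(
        (len(CHILDREN.get(c, [])) for c in CHILDREN.get(term, [])),
        reverse=True,
    )
    h = 0
    for rank, n in enumerate(counts, start=1):
        if n < rank:
            break
        h = rank
    return h


def relevant_ancestors(term: str, min_h: int) -> Set[str]:
    """Ancestors whose h-index is at least `min_h` (assumed filtering rule)."""
    return {a for a in ancestors(term) if h_index(a) >= min_h}


def sim_ui(a: str, b: str, min_h: int = 0) -> float:
    """Shared ancestors divided by combined ancestors; min_h=0 keeps all."""
    anc_a, anc_b = relevant_ancestors(a, min_h), relevant_ancestors(b, min_h)
    union = anc_a | anc_b
    return len(anc_a & anc_b) / len(union) if union else 0.0


def sim_gic(a: str, b: str, ic: Dict[str, float], min_h: int = 0) -> float:
    """Like sim_ui, but each ancestor is weighted by its information content."""
    anc_a, anc_b = relevant_ancestors(a, min_h), relevant_ancestors(b, min_h)
    total = sum(ic[t] for t in anc_a | anc_b)
    shared = sum(ic[t] for t in anc_a & anc_b)
    return shared / total if total else 0.0
```

In this toy ontology, `sim_ui("x", "z")` is 1/3 with every ancestor counted and rises to 1.0 with `min_h=1`, because only the structurally richer ancestors `A` and `root` remain. A validation step in the spirit of the abstract would then keep a recognized entity only if its similarity to other entities in the same text fragment exceeds a chosen threshold.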
Pages: 9