Combining feature selection and classifier ensemble using a multiobjective simulated annealing approach: application to named entity recognition

Cited by: 17
Authors
Ekbal, Asif [1 ]
Saha, Sriparna [1 ]
Affiliation
[1] Indian Inst Technol, Dept Comp Sci & Engn, Patna, Bihar, India
Keywords
Natural language processing; Named entity recognition; Maximum entropy (ME); Conditional random field (CRF); Support vector machine (SVM); Multiobjective optimization (MOO); Simulated annealing (SA); Classifier ensemble; Weighted voting; ALGORITHM; WEB;
DOI
10.1007/s00500-012-0885-6
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a two-stage multiobjective simulated annealing (MOSA)-based technique for named entity recognition (NER). First, MOSA is used for feature selection under two statistical classifiers, viz. conditional random field (CRF) and support vector machine (SVM). Each solution on the final Pareto optimal front provides a different classifier. These classifiers are then combined using a new classifier ensemble technique based on MOSA. Several different versions of the objective functions are exploited. We hypothesize that the reliability of each classifier's predictions differs among the various output classes. Thus, in an ensemble system, it is necessary to find the appropriate vote weight for each output class in each classifier. We propose a MOSA-based technique to determine these vote weights automatically. The proposed two-stage technique is evaluated for NER in Bengali, a resource-poor language, as well as in English. The highest recall, precision and F-measure values obtained are 93.95, 95.15 and 94.55%, respectively, for Bengali and 89.01, 89.35 and 89.18%, respectively, for English. Experiments also suggest that the classifier ensemble identified by the proposed MOO-based approach, when optimizing the F-measure values of named entity (NE) boundary detection, outperforms all the individual classifiers and four conventional baseline models.
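The two ideas the abstract relies on, class-specific vote weights and Pareto dominance between candidate solutions, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `weighted_vote` and `dominates`, the dictionary layout of the weights, and the example weight values are all assumptions made for illustration, and the search procedure (MOSA itself) is not shown.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one label per classifier into a single ensemble label.

    predictions: list of class labels, predictions[m] from classifier m.
    weights: list of dicts, weights[m][c] is the (hypothetical) vote weight
             of classifier m when it predicts class c.
    """
    scores = defaultdict(float)
    for m, label in enumerate(predictions):
        # A classifier only votes for the class it predicted, with the
        # weight assigned to that (classifier, class) pair.
        scores[label] += weights[m].get(label, 0.0)
    # Iterate over sorted labels so that ties are broken deterministically.
    return max(sorted(scores), key=lambda c: scores[c])


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


# Illustrative weights only; in the paper such weights are searched by MOSA.
example_weights = [
    {"PER": 0.9, "LOC": 0.2},
    {"PER": 0.3, "LOC": 0.4},
    {"PER": 0.1, "LOC": 0.3},
]
ensemble_label = weighted_vote(["PER", "LOC", "LOC"], example_weights)
```

In this example two of three classifiers predict `LOC`, yet the ensemble returns `PER` because the first classifier is assumed to be far more reliable on that class; plain majority voting would ignore this difference. The `dominates` check is the comparison a multiobjective search would use to decide which candidate weight settings belong on the Pareto front, for instance with (recall, precision) as the assumed objective pair.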
Pages: 1-16
Number of pages: 16