Arabic Named Entity Recognition: A Feature-Driven Study

Cited by: 38
Authors
Benajiba, Yassine [1, 2]
Diab, Mona [3]
Rosso, Paolo [1, 2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, Valencia 46022, Spain
[2] Univ Politecn Valencia, Nat Language Engn Lab, Valencia 46022, Spain
[3] Columbia Univ, Ctr Computat Learning Syst, New York, NY 10115 USA
Keywords
Arabic; machine learning comparison; named entity recognition; natural language processing (NLP); maximum entropy
DOI
10.1109/TASL.2009.2019927
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The Named Entity Recognition (NER) task aims at identifying and classifying named entities within open-domain text. The task has attracted significant attention recently because it has been shown to improve the performance of many Natural Language Processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely support vector machines, maximum entropy, and conditional random fields, for the task of NER. Our language of interest is Arabic. We explore lexical, contextual, and morphological features and nine data sets of different genres and annotations. We measure the impact of the different features in isolation and then combine them incrementally in order to evaluate each approach's robustness to noise. We achieve the highest performance using a combination of 15 features in conditional random fields on Broadcast News data (F(β=1) = 83.34).
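The F(β=1) figure reported in the abstract is the standard F-measure with β = 1, that is, the harmonic mean of precision and recall. As a minimal sketch of that formula only (the function name `f_beta` and the sample precision and recall values are illustrative assumptions, not taken from the paper):

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    With beta = 1 this is the usual F1 score. Inputs may be fractions
    (0..1) or percentages (0..100); the result is on the same scale.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # No correct predictions at all: define the score as zero.
        return 0.0
    return (1 + b2) * precision * recall / denom


# Hypothetical values for illustration only; they are not results from the paper.
example_precision = 0.80
example_recall = 0.60
print(round(f_beta(example_precision, example_recall), 4))
```

NER evaluations commonly compute precision and recall at the entity level before applying this formula; whether the paper does so is not stated in the abstract.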
Pages: 926-934
Page count: 9