Arabic Named Entity Recognition: A Feature-Driven Study

Cited by: 38
Authors
Benajiba, Yassine [1, 2]
Diab, Mona [3]
Rosso, Paolo [1, 2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, Valencia 46022, Spain
[2] Univ Politecn Valencia, Nat Language Engn Lab, Valencia 46022, Spain
[3] Columbia Univ, Ctr Computat Learning Syst, New York, NY 10115 USA
Keywords
Arabic; machine learning comparison; named entity recognition; natural language processing (NLP); maximum entropy
DOI
10.1109/TASL.2009.2019927
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The Named Entity Recognition task aims at identifying and classifying Named Entities within an open-domain text. This task has been garnering significant attention recently as it has been shown to help improve the performance of many Natural Language Processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely, support vector machines, maximum entropy and conditional random fields for the task of Named Entity Recognition. Our language of interest is Arabic. We explore lexical, contextual and morphological features and nine data-sets of different genres and annotations. We measure the impact of the different features in isolation and incrementally combine them in order to evaluate the robustness to noise of each approach. We achieve the highest performance using a combination of 15 features in conditional random fields using Broadcast News data (F(beta=1) = 83.34).
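The F(beta=1) figure quoted in the abstract is conventionally the weighted harmonic mean of precision and recall, with beta = 1 weighting them equally. The sketch below shows that standard definition only. This record does not describe the paper's evaluation protocol, so the function name `f_beta` and the precision and recall values are illustrative assumptions, not figures from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    With beta = 1 this reduces to 2 * P * R / (P + R).
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; the score is defined as 0 here.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical precision and recall, chosen only to illustrate the formula.
example_score = f_beta(precision=0.80, recall=0.90)
```

Scores such as 83.34 are this quantity expressed as a percentage, so `example_score` would be multiplied by 100 to be read on the same scale.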
Pages: 926-934
Page count: 9
Related papers
50 in total
  • [41] Bidirectional Encoder-Decoder Model for Arabic Named Entity Recognition
    Ali, Mohammed N. A.
    Tan, Guanzheng
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2019, 44 (11) : 9693 - 9701
  • [42] Bidirectional Recurrent Neural Network Approach for Arabic Named Entity Recognition
    Ali, Mohammed N. A.
    Tan, Guanzheng
    Hussain, Aamir
    FUTURE INTERNET, 2018, 10 (12):
  • [43] Arabic Named Entity Recognition: A Bidirectional GRU-CRF Approach
    Gridach, Mourad
    Haddad, Hatem
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2017), PT I, 2018, 10761 : 264 - 275
  • [45] ANERsys: An Arabic Named Entity Recognition system based on maximum entropy
    Benajiba, Yassine
    Rosso, Paolo
    Ruiz, Jose Miguel Benedi
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2007, 4394 : 143 - +
  • [46] Arabic named entity recognition via deep co-learning
    Helwe, Chadi
    Elbassuoni, Shady
    ARTIFICIAL INTELLIGENCE REVIEW, 2019, 52 (01) : 197 - 215
  • [47] A real time Named Entity Recognition system for Arabic text mining
    Al-Jumaily, Harith
    Martinez, Paloma
    Martinez-Fernandez, Jose L.
    Van der Goot, Erik
    LANGUAGE RESOURCES AND EVALUATION, 2012, 46 (04) : 543 - 563
  • [48] Enhancing Deep Learning with Embedded Features for Arabic Named Entity Recognition
    Lotfy, Ali
    Sabty, Caroline
    Abdennadher, Slim
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 4904 - 4912
  • [49] Wojood: Nested Arabic Named Entity Corpus and Recognition using BERT
    Jarrar, Mustafa
    Khalilia, Mohammed
    Ghanem, Sana
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 3626 - 3636
  • [50] Arabic Named Entity Recognition: What Works and What's Next
    Liu, Liyuan
    Shang, Jingbo
    Han, Jiawei
    FOURTH ARABIC NATURAL LANGUAGE PROCESSING WORKSHOP (WANLP 2019), 2019, : 60 - 67