Studying the impact of language-independent and language-specific features on hybrid Arabic Person name recognition

Cited by: 1
Authors
Oudah, Mai [1]
Shaalan, Khaled [2]
Affiliations
[1] Masdar Institute of Science and Technology, Abu Dhabi, United Arab Emirates
[2] British University in Dubai, Dubai International Academic City, United Arab Emirates
Keywords
Named entity recognition; Information extraction; Rule-based approach; Machine learning; Hybrid approach; Natural language processing; Entity recognition
DOI
10.1007/s10579-016-9376-1
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In this paper, extensive experiments are conducted to study the impact of features of different categories, both in isolation and added incrementally, on Arabic Person name recognition. We present an integrated system that combines the rule-based approach with the machine learning (ML)-based approach to form a consolidated hybrid system. Our feature space comprises language-independent and language-specific features. The explored features fall naturally into six categories: Person named entity tags predicted by the rule-based component, word-level features, POS features, morphological features, gazetteer features, and other contextual features. Because the decision tree algorithm has proved comparatively more efficient as a classifier in current state-of-the-art hybrid Named Entity Recognition for Arabic, it is adopted in this study as the ML technique used by the hybrid system. The experiments therefore focus on two dimensions: the standard dataset used and the set of selected features. A number of standard datasets are used for training and testing the hybrid system, including ACE (2003-2004) and ANERcorp. The experimental analysis indicates that both language-independent and language-specific features play an important role in overcoming the challenges posed by the Arabic language and have a critical impact on optimizing the performance of the hybrid system.
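As a rough illustration of the hybrid idea described in the abstract, the sketch below shows how the output of a rule-based component might be combined with other per-token features before being handed to a classifier. It is not the authors' implementation: the honorific list, the toy gazetteer, the single rule, and the function names `rule_based_tags` and `token_features` are all assumptions made for this example, and the POS and morphological categories are left out because they would require an external analyzer.

```python
from typing import Dict, List

# Toy resources, invented for illustration only (not taken from the paper).
# Trigger words that often precede a Person name.
HONORIFICS = {"الدكتور", "السيد", "الرئيس"}
# Tiny stand-in for a Person name gazetteer.
FIRST_NAMES = {"محمد", "خالد", "مي"}


def rule_based_tags(tokens: List[str]) -> List[str]:
    """Tag a token as PERSON if it directly follows an honorific trigger word."""
    tags = ["O"] * len(tokens)
    for i in range(1, len(tokens)):
        if tokens[i - 1] in HONORIFICS:
            tags[i] = "PERSON"
    return tags


def token_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Build one feature dict per token, mixing several feature categories."""
    rule_tags = rule_based_tags(tokens)
    rows = []
    for i, tok in enumerate(tokens):
        prev_tok = tokens[i - 1] if i > 0 else "<s>"
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        rows.append({
            # Prediction of the rule-based component, used as a feature.
            "rule_tag": rule_tags[i],
            # Word-level (language-independent) feature.
            "length": len(tok),
            # Crude language-specific cue: definite-article prefix.
            "has_al_prefix": tok.startswith("ال"),
            # Gazetteer feature.
            "in_gazetteer": tok in FIRST_NAMES,
            # Contextual features: neighbouring tokens.
            "prev_token": prev_tok,
            "next_token": next_tok,
        })
    return rows


sentence = ["قال", "الدكتور", "محمد", "اليوم"]
features = token_features(sentence)
```

Feature dictionaries of this kind could then be vectorized and passed to any decision tree learner, for example scikit-learn's `DictVectorizer` followed by `DecisionTreeClassifier`; that pairing is only a suggestion here, not a toolkit named in the abstract.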
Pages: 351-378
Page count: 28
Journal
Language Resources and Evaluation, 2017, 51: 351-378