On the Use of Arabic Stemmers to Increase the Recall of Information Retrieval Systems

Cited by: 0
Authors
Nasra, Ihab [1 ]
Maree, Mohammed [2 ]
Affiliations
[1] Arab Amer Univ, Dept Comp Sci, Jenin, Palestine
[2] Arab Amer Univ, Dept Informat Technol, Jenin, Palestine
Keywords
Information Retrieval; Arabic Stemming; Morphological Analysis; Natural Language Processing; Rule-Based Stemmers;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Building robust information retrieval systems demands efficient natural language processing and morphological analysis techniques. These techniques are commonly used to find syntactic and semantic matches between users' queries and the corresponding documents. Word stemming is one of those techniques and has been widely employed in Information Retrieval systems, namely to increase their recall. Much research has been conducted to evaluate English stemming techniques; however, little attention has been given to Arabic stemmers. In this research work, we present a comprehensive review of state-of-the-art Arabic stemming techniques and compare them according to a variety of criteria. In addition, we classify existing Arabic stemmers into four categories: Root-based, Affix Removal, Rule-based, and Context-based techniques. We review seven of the most commonly used Arabic stemming algorithms that fall under these categories, and provide a comparative analysis and evaluation of them according to the goal, input, employed approach, and output of each technique. We conclude this study by proposing a hybrid Arabic stemming approach that combines multiple stemmers and exploits a new set of rules to better stem Arabic words.
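To illustrate how stemming can raise recall, here is a minimal Python sketch of the Affix Removal category named in the abstract. It is not the authors' method or any of the seven reviewed algorithms. The prefix and suffix lists, the `min_len` threshold, and the function names `light_stem` and `matches` are all assumptions chosen for this example; a real stemmer would use a much larger and more carefully ordered rule set.

```python
# Hypothetical affix lists for illustration only (not taken from the paper).
# Longer prefixes are listed first so they are tried before shorter ones.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def matches(query: str, document: str) -> bool:
    """Return True if the stemmed query term equals any stemmed document token."""
    doc_stems = {light_stem(token) for token in document.split()}
    return light_stem(query) in doc_stems


if __name__ == "__main__":
    document = "حضر المعلمون الاجتماع"
    query = "معلم"
    # Exact token matching misses the inflected form in the document.
    print("exact match:", query in document.split())
    # Matching on stems retrieves the document, which is the recall gain.
    print("stemmed match:", matches(query, document))
```

In this sketch the query term and the inflected document term reduce to the same stem, so a document that exact matching would miss is retrieved. The trade-off, under these assumed rules, is that aggressive stripping can also conflate unrelated words, which is one reason the minimum-length guard is included.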
Pages: 2462-2468
Number of pages: 7