A Hybrid Approach for Arabic Multi-Word Term Extraction

Authors
Bounhas, Ibrahim [1 ]
Slimani, Yahya [1 ]
Affiliations
[1] Univ Tunis, Fac Sci Tunis, Dept Comp Sci, Tunis 1060, Tunisia
Keywords
Arabic language processing; morpho-syntactic parsing; multi-word terms; terminology extraction
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Building a domain model from a specialized corpus requires identifying candidate terms and the semantic relations between them. Once constructed, the model can be used for many information retrieval tasks. Multi-word terms are of great importance in this process. On the one hand, they constitute domain-relevant candidate terms. On the other hand, the syntactic relations that link their constituents can be used to infer semantic relations between terms. In this paper, we propose to extract multi-word terms from Arabic specialized corpora. The proposed approach uses linguistic rules based on morphological features and POS (part-of-speech) tags to parse documents and retrieve candidate terms. Statistical measures are used to deal with ambiguities generated by the linguistic tools and to rank candidate terms according to their relevance. We present experiments on a corpus from the environment domain and report high-quality results that confirm the targets set for the precision metric.
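The two-stage idea in the abstract (a linguistic filter over POS tags, followed by statistical ranking) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tag names, the two patterns in `PATTERNS`, the use of pointwise mutual information (PMI) as the ranking measure, and the transliterated toy sentences. The abstract does not state which rules or measures the authors actually use, and the morphological features it mentions are not modeled.

```python
from collections import Counter
from math import log

# Hypothetical POS patterns for two-word candidates (assumed, not from the paper).
PATTERNS = {("NOUN", "NOUN"), ("NOUN", "ADJ")}


def extract_candidates(tagged_sentences):
    """Count adjacent word pairs whose POS tags match one of PATTERNS."""
    pair_counts = Counter()
    word_counts = Counter()
    for sentence in tagged_sentences:
        for word, _tag in sentence:
            word_counts[word] += 1
        # Slide over adjacent (word, tag) pairs within the sentence.
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            if (t1, t2) in PATTERNS:
                pair_counts[(w1, w2)] += 1
    return pair_counts, word_counts


def rank_by_pmi(pair_counts, word_counts):
    """Rank candidates by a rough PMI estimate (stand-in statistical measure).

    Probabilities are approximated with the token total as the denominator
    for both single words and pairs.
    """
    total = sum(word_counts.values())
    scores = {
        pair: log(n * total / (word_counts[pair[0]] * word_counts[pair[1]]))
        for pair, n in pair_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, hand-tagged input in transliteration (invented for this sketch).
sentences = [
    [("talawwuth", "NOUN"), ("al-hawa", "NOUN"), ("yazdad", "VERB")],
    [("talawwuth", "NOUN"), ("al-hawa", "NOUN")],
]
pairs, words = extract_candidates(sentences)
ranked = rank_by_pmi(pairs, words)
```

In this sketch `ranked` is a list of `(pair, score)` tuples sorted from the strongest association to the weakest. A real system would take its tags from an Arabic morphological analyzer and would need a disambiguation step, which the abstract attributes to the statistical measures.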
Pages: 429-436
Number of pages: 8
Related papers
  • [1] A multi-word term extraction program for Arabic language
    Boulaknadel, Siham
    Daille, Beatrice
    Aboutajdine, Driss
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 1485 - 1488
  • [2] A multi-word term extraction system
    Chen, Jisong
    Yeh, Chung-Hsing
    Chau, Rowena
    PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 1160 - 1165
  • [3] Multi-word term indexing for Arabic document retrieval
    Boulaknadel, Siham
    Daille, Beatrice
    Driss, Aboutajdine
    2008 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, VOLS 1-3, 2008, : 480 - +
  • [4] A Language-Independent Hybrid Approach for Multi-Word Expression Extraction
    Liang, Yinghong
    Tan, Hongye
    Li, Hui
    Wang, Zhigang
    Gui, Wenming
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 3273 - 3279
  • [5] Automatic Chinese Multi-Word Term Extraction
    Nari Song
    Feng, Zhiwei
    Kit, Chunyu
    ALPIT 2008: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, PROCEEDINGS, 2008, : 181 - 184
  • [6] A Contrastive Approach to Multi-word Term Extraction from Domain Corpora
    Bonin, Francesca
    Dell'Orletta, Felice
    Venturi, Giulia
    Montemagni, Simonetta
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010,
  • [7] Topic Detection and Multi-word Terms Extraction for Arabic Unvowelized Documents
    Koulali, Rim
    Meziane, Ahdelouafi
    INFORMATION RETRIEVAL TECHNOLOGY, 2011, 7097 : 614 - 623
  • [8] Word Embedding Approach for Synonym Extraction of Multi-Word Terms
    Hazem, Amir
    Daille, Beatrice
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 297 - 303
  • [9] A Combined Approach for the Extraction of the Multi-word and Nested Biomedical
    Gong, Lejun
    Feng, Jiacheng
    Yang, Ronggen
    Yang, Geng
    2015 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2015, : 708 - 711
  • [10] A contrastive Approach to Multi-word Term Extraction from Domain-specific Corpora
    Bonin, Francesca
    Dell' Orletta, Felice
    Venturi, Giulia
    Montemagni, Simonetta
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010,