A Hybrid Approach for Arabic Multi-Word Term Extraction

Cited by: 0
Authors
Bounhas, Ibrahim [1]
Slimani, Yahya [1]
Affiliations
[1] Univ Tunis, Fac Sci Tunis, Dept Comp Sci, Tunis 1060, Tunisia
Keywords
Arabic language processing; morpho-syntactic parsing; multi-word terms; terminology extraction
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Building a domain model from a specialized corpus requires identifying candidate terms and the semantic relations between them. Once constructed, this model can be used for many information retrieval tasks. Multi-word terms are of great importance in this process. On the one hand, they constitute domain-relevant candidate terms. On the other hand, the syntactic relations that link their constituents can be used to infer semantic relations between terms. In this paper, we propose to extract multi-word terms from Arabic specialized corpora. The proposed approach uses linguistic rules based on morphological features and POS (part-of-speech) tags to parse documents and retrieve candidate terms. Statistical measures are used to resolve ambiguities generated by the linguistic tools and to rank candidate terms according to their relevance. We present experiments on a corpus from the environment domain and report high-quality results that meet the targets set for the precision metric.
Pages: 429-436
Page count: 8