Bilingual Lexicon Extraction from Comparable Corpora Based on Closed Concepts Mining

Cited by: 3
Authors
Chebel, Mohamed [1]
Latiri, Chiraz [1]
Gaussier, Eric [2]
Affiliations
[1] Univ Tunis El Manar, Fac Sci Tunis, Res Lab LIPAH, Tunis, Tunisia
[2] Univ Joseph Fourier, Res Lab LIG, Grenoble I, AMA Grp, Grenoble, France
DOI
10.1007/978-3-319-57454-7_46
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose to complement the context vectors used in bilingual lexicon extraction from comparable corpora with concept vectors, which aim to capture all the words related to the concepts associated with a given word. This yields a less sparse representation, especially in specialized domains where the use of a general bilingual lexicon leaves many words untranslated. The concept vectors we consider are based on closed concepts mining, developed in Formal Concept Analysis (FCA). Results on two different comparable corpora show that enriching context vectors with concept vectors leads to lexicons of higher quality, especially in specialized domains.
Pages: 586-598
Number of pages: 13
Related papers
50 items in total
  • [21] Word sense acquisition from bilingual comparable corpora
    Kaji, H
    HLT-NAACL 2003: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2003: 111 - 118
  • [22] Knowledge extraction from bilingual corpora
    Somers, H
    INFORMATION EXTRACTION: TOWARDS SCALABLE, ADAPTABLE SYSTEMS, 1999, 1714 : 120 - 133
  • [23] Bilingual Lexicon Extraction from Arabic-English Parallel Corpora with a View to Machine Translation
    Sabtan, Yasser Muhammad Naguib
    ARAB WORLD ENGLISH JOURNAL, 2016: 317 - 336
  • [24] Bilingual Contexts from Comparable Corpora to Mine for Translations of Collocations
    Taslimipoor, Shiva
    Mitkov, Ruslan
    Pastor, Gloria Corpas
    Fazly, Afsaneh
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, (CICLING 2016), PT II, 2018, 9624 : 115 - 126
  • [25] Terminology Extraction from Comparable Corpora for Latvian
    Gornostay, Tatiana
    Ramm, Anita
    Heid, Ulrich
    Morin, Emmanuel
    Harastani, Rima
    Planas, Emmanuel
    HUMAN LANGUAGE TECHNOLOGIES: THE BALTIC PERSPECTIVE, 2012, 247 : 66 - +
  • [26] Extraction of Interlingual Documents Clusters Based on Closed Concepts Mining
    Chebel, Mohamed
    Latiri, Chiraz
    Gaussier, Eric
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS 19TH ANNUAL CONFERENCE, KES-2015, 2015, 60 : 537 - 546
  • [27] Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models
    Hazem, Amir
    Morin, Emmanuel
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016: 4184 - 4187
  • [28] Combining Bilingual Lexicons Extracted from Comparable Corpora: The Complementary Approach Between Word Embedding and Text Mining
    Rhouma, Sourour Belhaj
    Latiri, Chiraz
    Berrut, Catherine
    DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA 2018), PT II, 2018, 11030 : 510 - 518
  • [29] Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages
    Ljubesic, Nikola
    Fiser, Darja
    TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 91 - 98