Combining Bilingual Lexicons Extracted from Comparable Corpora: The Complementary Approach Between Word Embedding and Text Mining

Citations: 0
Authors
Rhouma, Sourour Belhaj [1 ]
Latiri, Chiraz [1 ]
Berrut, Catherine [2 ]
Affiliations
[1] Univ Tunis El Manar, Fac Sciences Tunis, LIPAH LR11ES14, Tunis 2092, Tunisia
[2] Univ Grenoble Alpes, MRIM Grp, LIG Lab, Grenoble, France
DOI
10.1007/978-3-319-98812-2_47
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, several approaches to bilingual lexicon extraction from comparable corpora have been proposed. This paper presents how to combine different methods for bilingual lexicon extraction based on standard context vectors and advanced text mining methods. In this respect, we focus on combining bilingual lexicons based on context vectors, association rules, and contextual meta-rules. The combination of lexicons leads to a less sparse representation, from which the most effective translations can be extracted to create an optimal bilingual lexicon. An experimental validation conducted on two language pairs from the CLEF 2003 evaluation campaign shows that the combination of the models gives a significant improvement over the standard approach.
Pages: 510 - 518
Page count: 9
Related papers
17 in total
  • [1] Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages
    Ljubesic, Nikola
    Fiser, Darja
    TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 91 - 98
  • [2] Word sense acquisition from bilingual comparable corpora
    Kaji, H
    HLT-NAACL 2003: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2003 : 111 - 118
  • [3] Extraction of bilingual lexicons from comparable corpora specialty: study of the lexical context
    Hazem, Amir
    Morin, Emmanuel
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2014, 55 (01) : 13 - 44
  • [4] Bilingual Lexicon Extraction with Temporal Distributed Word Representation from Comparable Corpora
    Zhang, Chunyue
    Zhao, Tiejun
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2015, 2015, 9362 : 380 - 387
  • [5] Bilingual Lexicon Extraction from Comparable Corpora Based on Closed Concepts Mining
    Chebel, Mohamed
    Latiri, Chiraz
    Gaussier, Eric
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT I, 2017, 10234 : 586 - 598
  • [6] Towards mining bilingual lexicons and parallel phrases from large-scale monolingual corpora
    Wu, Shilong
    Wang, Xu
    Ning, Qiuyi
    Qiu, Shigui
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021
  • [7] A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora
    Khademian, Mahdi
    Taghipour, Kaveh
    Mansour, Saab
    Khadivi, Shahram
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012 : 4073 - 4079
  • [8] Combining Lexical Context with Pseudo-alignment for Bilingual Lexicon Extraction from Comparable Corpora
    Li, Bo
    Zhu, Qunyan
    He, Tingting
    Chen, Qianjun
    CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2014, 2014, 8801 : 223 - 233
  • [9] Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models
    Hazem, Amir
    Morin, Emmanuel
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016 : 4184 - 4187
  • [10] Entity Translation Mining from Comparable Corpora: Combining Graph Mapping with Corpus Latent Features
    Kim, Jinhan
    Hwang, Seung-won
    Jiang, Long
    Song, Young-in
    Zhou, Ming
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2013, 25 (08) : 1787 - 1800