Bilingual Lexicon Extraction from Comparable Corpora Based on Closed Concepts Mining

Cited by: 3
Authors
Chebel, Mohamed [1]
Latiri, Chiraz [1]
Gaussier, Eric [2]
Affiliations
[1] Univ Tunis El Manar, Fac Sci Tunis, Res Lab LIPAH, Tunis, Tunisia
[2] Univ Joseph Fourier, Res Lab LIG, Grenoble I, AMA Grp, Grenoble, France
DOI
10.1007/978-3-319-57454-7_46
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose to complement the context vectors used in bilingual lexicon extraction from comparable corpora with concept vectors, which aim to capture all the words related to the concepts associated with a given word. This yields a less sparse representation, especially in specialized domains where a general bilingual lexicon leaves many words untranslated. The concept vectors we consider are based on closed concepts mining, as developed in Formal Concept Analysis (FCA). Results on two different comparable corpora show that enriching context vectors with concept vectors leads to lexicons of higher quality, especially in specialized domains.
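The abstract does not spell out how concept vectors are built or combined with context vectors, so the following is only a minimal sketch under stated assumptions, not the authors' method. It treats documents as FCA objects and words as attributes, takes the closure of a single word as its "closed concept", and adds those words to a window-based context vector. The function names (`context_vector`, `closure`, `concept_vector`, `enrich`), the window size, and the mixing `weight` are all hypothetical.

```python
from collections import Counter


def context_vector(word, sentences, window=2):
    """Count the words that co-occur with `word` within +/- `window` tokens."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def closure(termset, documents):
    """FCA closure of a set of words: the words shared by every document
    that contains all of `termset`. Documents are sets of words."""
    supporting = [set(d) for d in documents if termset <= set(d)]
    if not supporting:
        # By FCA convention, an unsupported set closes to the full vocabulary.
        return frozenset().union(*documents)
    return frozenset(set.intersection(*supporting))


def concept_vector(word, documents):
    """Assumption: the concept vector marks every other word in the closed
    concept generated by `word` with a count of 1."""
    closed = closure(frozenset([word]), documents)
    return Counter({w: 1 for w in closed if w != word})


def enrich(context, concept, weight=0.5):
    """Assumption: enrichment is a weighted sum of the two vectors."""
    combined = dict(context)
    for w, v in concept.items():
        combined[w] = combined.get(w, 0) + weight * v
    return combined


# Toy data, invented for illustration only.
documents = [
    {"cancer", "tumor", "breast"},
    {"cancer", "tumor", "therapy"},
    {"therapy", "dose"},
]
sentences = [["breast", "cancer", "therapy"]]

ctx = context_vector("cancer", sentences, window=1)
cpt = concept_vector("cancer", documents)
enriched = enrich(ctx, cpt)
```

In this toy setting, "tumor" never appears in the context window of "cancer" but co-occurs with it in every document, so the enriched vector gains an entry for it. This is the sense in which the representation becomes less sparse.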
Pages: 586-598
Number of pages: 13