Cross-Lingual Document Retrieval Using Regularized Wasserstein Distance

Cited by: 2
Authors
Balikas, Georgios [1]
Laclau, Charlotte [1]
Redko, Ievgen [2]
Amini, Massih-Reza [1]
Affiliations
[1] Univ Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
[2] Univ Lyon, Univ Claude Bernard Lyon 1, INSA Lyon, F69XXX, UJM St Etienne, CNRS, Inserm, CREATIS UMR 5220, U1206, Lyon, France
DOI
10.1007/978-3-319-76941-7_30
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many information retrieval algorithms rely on a good distance that allows objects of different nature to be compared efficiently. Recently, a promising new metric called Word Mover's Distance was proposed to measure the divergence between text passages. In this paper, we demonstrate that this metric can be extended to incorporate term-weighting schemes and, through entropic regularization, to provide more accurate and computationally efficient matching between documents. We evaluate the benefits of both extensions in the task of cross-lingual document retrieval (CLDR). Our experimental results on eight CLDR problems suggest that the proposed methods achieve remarkable improvements in terms of Mean Reciprocal Rank compared to several baselines.
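As a rough illustration of the entropic regularization mentioned in the abstract, the sketch below runs standard Sinkhorn iterations on a transport problem between two weighted bags of word vectors. It is not the authors' implementation: the function name `sinkhorn_distance`, the 2-D toy embeddings, the term weights, and the values of `reg` and `n_iters` are all assumptions chosen for the example.

```python
import numpy as np

def sinkhorn_distance(a, b, cost, reg=0.5, n_iters=1000, tol=1e-9):
    """Entropy-regularized transport cost between weight vectors a and b.

    a, b : non-negative term weights, each summing to 1
    cost : pairwise distances between the word embeddings of the two documents
    reg  : regularization strength (assumed value, not taken from the paper)
    """
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale to match column marginals
        u_new = a / (K @ v)              # rescale to match row marginals
        converged = np.max(np.abs(u_new - u)) < tol
        u = u_new
        if converged:
            break
    plan = u[:, None] * K * v[None, :]   # transport plan
    return float(np.sum(plan * cost)), plan

def pairwise_cost(x, y):
    """Euclidean distance between every row of x and every row of y."""
    return np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)

# Hypothetical 2-D "embeddings" for three tiny documents of two words each.
query = np.array([[0.0, 0.0], [1.0, 0.0]])
near = np.array([[0.0, 0.1], [1.0, 0.1]])    # close to the query in embedding space
far = np.array([[5.0, 5.0], [6.0, 5.0]])     # far from the query

a = np.array([0.6, 0.4])   # assumed term weights for the query (e.g. normalized tf-idf)
b = np.array([0.5, 0.5])   # assumed term weights for each candidate document

d_near, plan_near = sinkhorn_distance(a, b, pairwise_cost(query, near))
d_far, _ = sinkhorn_distance(a, b, pairwise_cost(query, far))
print(d_near < d_far)      # the nearby document should rank first
```

In a retrieval setting, candidate documents would be ranked by this distance to the query. Smaller values of `reg` move the result closer to the unregularized Word Mover's Distance, at the cost of slower and less numerically stable iterations.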
Pages: 398 - 410
Page count: 13
Related papers
50 in total
  • [41] A method of cross-lingual consumer health information retrieval
    Neveol, Aurelie
    Pereira, Suzanne
    Soualmia, Lina F.
    Thirion, Benoit
    Darmoni, Stefan J.
    UBIQUITY: TECHNOLOGIES FOR BETTER HEALTH IN AGING SOCIETIES, 2006, 124 : 601 - 608
  • [42] Effective translation, tokenization and combination for cross-lingual retrieval
    Kamps, J
    Adafre, SF
    de Rijke, M
    MULTILINGUAL INFORMATION ACCESS FOR TEXT, SPEECH AND IMAGES, 2005, 3491 : 123 - 134
  • [43] Exploiting Wikipedia for cross-lingual and multilingual information retrieval
    Sorg, P.
    Cimiano, P.
    DATA & KNOWLEDGE ENGINEERING, 2012, 74 : 26 - 45
  • [44] Cross-Lingual Information Retrieval System for Indian Languages
    Jagarlamudi, Jagadeesh
    Kumaran, A.
    ADVANCES IN MULTILINGUAL AND MULTIMODAL INFORMATION RETRIEVAL, 2008, 5152 : 80 - 87
  • [45] CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer
    Wang, Yabing
    Wang, Fan
    Dong, Jianfeng
    Luo, Hao
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024 : 5651 - 5659
  • [46] Cross-lingual Diachronic Distance: Application to Portuguese and Spanish
    Pichel Campos, Jose Ramom
    Gamallo Otero, Pablo
    Alegria Loinaz, Inaki
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2019, (63) : 77 - 84
  • [47] WikiTranslate: Query Translation for Cross-Lingual Information Retrieval Using Only Wikipedia
    Nguyen, Dong
    Overwijk, Arnold
    Hauff, Claudia
    Trieschnigg, Dolf R. B.
    Hiemstra, Djoerd
    de Jong, Franciska
    EVALUATING SYSTEMS FOR MULTILINGUAL AND MULTIMODAL INFORMATION ACCESS, 2009, 5706 : 58 - 65
  • [48] Using query-relevant documents pairs for cross-lingual information retrieval
    Pinto, David
    Juan, Alfons
    Rosso, Paolo
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2007, 4629 : 630 - 637
  • [49] Cross-Lingual Text Classification with Model Translation and Document Translation
    Moh, Teng-Sheng
    Zhang, Zhang
    PROCEEDINGS OF THE 50TH ANNUAL ASSOCIATION FOR COMPUTING MACHINERY SOUTHEAST CONFERENCE, 2012
  • [50] Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning
    Zhou, Xinjie
    Wan, Xiaojun
    Xiao, Jianguo
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016 : 1403 - 1412