Cross-Lingual Document Retrieval Using Regularized Wasserstein Distance

Cited by: 2
Authors
Balikas, Georgios [1]
Laclau, Charlotte [1]
Redko, Ievgen [2]
Amini, Massih-Reza [1]
Affiliations
[1] Univ Grenoble Alpes, CNRS, Grenoble INP, LIG, Grenoble, France
[2] Univ Lyon, Univ Claude Bernard Lyon 1, INSA Lyon, F69XXX, UJM St Etienne, CNRS, Inserm, CREATIS UMR 5220, U1206, Lyon, France
DOI
10.1007/978-3-319-76941-7_30
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many information retrieval algorithms rely on a good distance that allows objects of a different nature to be compared efficiently. Recently, a promising new metric called Word Mover's Distance was proposed to measure the divergence between text passages. In this paper, we demonstrate that this metric can be extended to incorporate term-weighting schemes and, using entropic regularization, to provide more accurate and computationally efficient matching between documents. We evaluate the benefits of both extensions in the task of cross-lingual document retrieval (CLDR). Our experimental results on eight CLDR problems suggest that the proposed methods achieve remarkable improvements in terms of Mean Reciprocal Rank compared to several baselines.
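As a rough illustration of the entropic regularization mentioned in the abstract, the sketch below computes a regularized transport cost between two term-weighted documents with Sinkhorn iterations. It is not the authors' implementation: the function name `sinkhorn_cost`, the parameters `reg` and `n_iters`, the toy 2-D embeddings, and the term weights are all assumptions made for the example, and the two documents are assumed to have word embeddings in a shared cross-lingual space.

```python
import math

def sinkhorn_cost(a, b, cost, reg=0.5, n_iters=500):
    """Entropy-regularized transport cost between weight vectors a and b.

    a, b  : term weights of each document (each sums to 1)
    cost  : cost[i][j] = distance between word i of doc A and word j of doc B
    reg   : regularization strength (hypothetical default, chosen for the toy data)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale columns and rows to match the target marginals
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan

# Toy data (assumed): three words per document, embedded in a shared 2-D space
emb_src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
emb_tgt = [(0.1, 0.0), (0.9, 0.1), (0.0, 1.1)]
w_src = [0.5, 0.3, 0.2]  # e.g. normalized term weights of the query document
w_tgt = [0.4, 0.4, 0.2]  # e.g. normalized term weights of a candidate document

cost = [[math.dist(p, q) for q in emb_tgt] for p in emb_src]
dist, plan = sinkhorn_cost(w_src, w_tgt, cost)
```

In a retrieval setting, one would compute `dist` between the query and every candidate document and rank candidates by increasing value. Smaller `reg` values bring the result closer to the unregularized Wasserstein distance but typically need more iterations.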
Pages: 398-410
Page count: 13