Multilingual information retrieval in the language modeling framework

Cited by: 12
Authors
Rahimi, Razieh [1]
Shakery, Azadeh [1, 2]
King, Irwin [3]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
[3] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
Source
INFORMATION RETRIEVAL JOURNAL | 2015 / Vol. 18 / Issue 03
Keywords
Multilingual information retrieval; Multilingual language models; KL-divergence framework; Language modeling framework; Multilingual feedback; MERGING STRATEGY; SYSTEM
DOI
10.1007/s10791-015-9255-1
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multilingual information retrieval (MLIR) provides results that are more comprehensive than those of mono- and cross-lingual retrieval. Methods for MLIR are categorized as: (1) Fusion-based methods that merge results from multiple retrieval runs, and (2) Direct methods that build a unique index for the entire collection. Merging results of individual runs reduces the overall effectiveness, while more effective direct methods suffer from either time complexity and memory overhead, or over-weighting of index terms. In this paper, we propose a direct MLIR approach by using the language modeling framework that includes a novel multilingual language model estimation for documents, and a new way to globally estimate word statistics. These contributions enable ranking documents in multiple languages in one retrieval phase without having the problems of the previous direct methods. Moreover, our approach has the advantage of accommodating multilingual feedback information which helps to prevent query drift, and consequently to improve the performance. Finally, we effectively address the common case of incomplete coverage of translation resources in our proposed estimation methods. Experimental results show that the proposed approach outperforms the previous MLIR approaches.
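The abstract does not give the estimation formulas, so the following is only a generic, monolingual sketch of the KL-divergence retrieval framework named in the keywords, assuming Dirichlet-prior smoothing for the document model. It is not the multilingual language model estimation proposed in the paper; the function names (`dirichlet_lm`, `kl_score`, `rank`), the smoothing parameter `mu`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def dirichlet_lm(doc_tokens, coll_counts, coll_len, mu=2000.0):
    """Return a function giving p(w | document) with Dirichlet-prior smoothing.

    Assumption: the collection model is a plain maximum-likelihood estimate
    over all documents; the paper's global word statistics are not reproduced.
    """
    tf = Counter(doc_tokens)
    doc_len = len(doc_tokens)

    def prob(word):
        p_coll = coll_counts.get(word, 0) / coll_len
        return (tf.get(word, 0) + mu * p_coll) / (doc_len + mu)

    return prob


def kl_score(query_tokens, doc_prob):
    """Rank-equivalent form of -KL(query || doc): sum_w p(w|q) * log p(w|d)."""
    q_tf = Counter(query_tokens)
    q_len = len(query_tokens)
    score = 0.0
    for word, count in q_tf.items():
        p = doc_prob(word)
        if p <= 0.0:
            # Word absent from the whole collection: skipped for every
            # document alike, so the ranking is unaffected.
            continue
        score += (count / q_len) * math.log(p)
    return score


def rank(query_tokens, docs, mu=2000.0):
    """Score every document against the query; best score first."""
    coll_counts = Counter(t for tokens in docs.values() for t in tokens)
    coll_len = sum(coll_counts.values())
    scored = [
        (doc_id, kl_score(query_tokens,
                          dirichlet_lm(tokens, coll_counts, coll_len, mu)))
        for doc_id, tokens in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": ["multilingual", "retrieval", "language", "model"],
    "d2": ["image", "compression", "codec", "model"],
}
ranking = rank(["multilingual", "retrieval"], docs, mu=10.0)
```

In this sketch every document is scored in a single pass against one query model. A direct multilingual method in the sense of the abstract would additionally need document and query models that span several languages, which this example does not attempt.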
Pages: 246-281
Page count: 36