Cross-Lingual Training of Neural Models for Document Ranking

Cited by: 0
Authors
Shi, Peng [1]
Bai, He [1]
Lin, Jimmy [1]
Affiliations
[1] University of Waterloo, David R. Cheriton School of Computer Science, Waterloo, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multi-lingual BERT (mBERT) to document ranking and additionally compares it against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) mono-lingual retrieval, but other "low resource" approaches are competitive as well.
Pages: 2768-2773
Page count: 6