Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation

Cited by: 0
Authors
Gao S. [1, 2]
He Z. [1, 2]
Yu Z. [1, 2]
Zhu E. [1, 2]
Wu S. [1, 2]
Affiliations
[1] Kunming University of Science and Technology, Kunming
[2] Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming
Funding
National Natural Science Foundation of China
Keywords
Cross-lingual; event retrieval; knowledge distillation; language bias
DOI
10.3233/JIFS-235749
Abstract
Cross-lingual event retrieval is an information retrieval task that finds texts or documents related to a specific event when the query and the documents are in different languages. In Chinese-Vietnamese cross-lingual event retrieval, a Chinese query is used to retrieve Vietnamese documents related to the queried event. The critical issue is how to align query and document representations efficiently with limited resources. Existing cross-lingual pre-trained models are trained on large-scale multilingual corpora, but their training objectives do not include explicit language alignment tasks. Because the training corpora are unevenly distributed across languages, these models suffer from language bias, and cross-lingual retrieval built on them inherits that bias. To solve this problem, this paper proposes a Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation. By transferring knowledge from high-resource to low-resource languages, the approach enables the model to learn good query-document matching features from monolingual retrieval. By strengthening the alignment between queries and documents in different languages in a shared semantic space, the method improves the performance of Chinese-Vietnamese cross-lingual event retrieval. © 2024 – IOS Press.
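The abstract does not state the distillation objective, so the following is only a minimal sketch of one common way to set it up, not the authors' method. It assumes a monolingual teacher and a cross-lingual student that score the same candidate list, and it penalizes the student with the KL divergence between the two softmax-normalized score distributions. The function names, the dot-product scoring, and the toy embedding values are all hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw matching scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot-product relevance score between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) over the same ordered candidate list."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy embeddings (hypothetical values, for illustration only).
# Teacher: query and documents in the same high-resource language.
teacher_query = [1.0, 0.0]
teacher_docs = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
# Student: Chinese query against the parallel Vietnamese documents.
student_query = [0.6, 0.4]
student_docs = [[0.7, 0.3], [0.4, 0.6], [0.5, 0.5]]

teacher_scores = [dot(teacher_query, d) for d in teacher_docs]
student_scores = [dot(student_query, d) for d in student_docs]
loss = distillation_loss(teacher_scores, student_scores)
print(f"distillation loss: {loss:.4f}")
```

In a real training loop the embeddings would come from the two encoders and the loss would be minimized with respect to the student's parameters. The loss is zero only when the student ranks the candidates with the same score distribution as the teacher.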
Pages: 8461 - 8475
Number of pages: 14