Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation

Authors
Gao S. [1, 2]
He Z. [1, 2]
Yu Z. [1, 2]
Zhu E. [1, 2]
Wu S. [1, 2]
Affiliations
[1] Kunming University of Science and Technology, Kunming
[2] Key Laboratory of Artificial Intelligence in Yunnan Province, Kunming
Funding
National Natural Science Foundation of China
Keywords
Cross-lingual; event retrieval; knowledge distillation; language bias
DOI
10.3233/JIFS-235749
Abstract
Cross-lingual event retrieval is an information retrieval task that aims to find text or documents related to a specific event across multiple languages. In the Chinese-Vietnamese setting, a Chinese query is used to retrieve Vietnamese documents related to the queried event. The critical issue is how to efficiently align query and document representations with limited resources. Existing cross-lingual pre-trained models are trained on large-scale multilingual corpora, but their training objectives do not include explicit language alignment tasks. Because the training corpora are unevenly distributed across languages, these models suffer from language bias, and cross-lingual retrieval built on them inherits this bias. To solve this problem, this paper proposes a Chinese-Vietnamese cross-lingual event retrieval method based on knowledge distillation. The approach enables the model to learn good query-document matching features from monolingual retrieval by transferring knowledge from high-resource to low-resource languages. By enhancing the alignment between queries and documents in different languages in a shared semantic space, the method improves the performance of Chinese-Vietnamese cross-lingual event retrieval. © 2024 – IOS Press.
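The abstract does not give the training objective, so the following is only a minimal sketch of one common way such a distillation could be set up, not the authors' method. It assumes a teacher that scores a monolingual query against candidate documents, a student that scores the cross-lingual query against the same documents, and a KL-divergence loss between the two softmax-normalized score distributions. The function names, the toy embeddings, and the temperature parameter are all hypothetical.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw relevance scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot-product relevance score between two embedding vectors."""
    return sum(a * b for a, b in zip(u, v))

def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) over the candidate documents (assumed objective)."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy, hand-made embeddings (hypothetical, for illustration only).
teacher_query = [1.0, 0.0]   # monolingual query as seen by the teacher
student_query = [0.6, 0.4]   # cross-lingual query as seen by the student
docs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # shared candidate documents

teacher_scores = [dot(teacher_query, d) for d in docs]
student_scores = [dot(student_query, d) for d in docs]
loss = distillation_loss(teacher_scores, student_scores)
print(f"distillation loss: {loss:.4f}")
```

Minimizing a loss of this kind would push the student's cross-lingual ranking toward the teacher's monolingual ranking; the loss is zero when both score distributions coincide.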
Pages: 8461-8475
Page count: 14