Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models

Authors
Qi, Jirui [1]
Fernandez, Raquel [2]
Bisazza, Arianna [1]
Affiliations
[1] Univ Groningen, Ctr Language & Cognit, Groningen, Netherlands
[2] Univ Amsterdam, Inst Log Language & Computat, Amsterdam, Netherlands
Funding
European Research Council; Dutch Research Council
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multilingual large-scale Pretrained Language Models (PLMs) have been shown to store considerable amounts of factual knowledge, but large variations are observed across languages. With the ultimate goal of ensuring that users with different language backgrounds obtain consistent feedback from the same model, we study the cross-lingual consistency (CLC) of factual knowledge in various multilingual PLMs. To this end, we propose a Ranking-based Consistency (RankC) metric to evaluate knowledge consistency across languages independently from accuracy. Using this metric, we conduct an in-depth analysis of the determining factors for CLC, both at model level and at language-pair level. Among other results, we find that increasing model size leads to higher factual probing accuracy in most languages, but does not improve cross-lingual consistency. Finally, we conduct a case study on CLC when new factual associations are inserted in the PLMs via model editing. Results on a small sample of facts inserted in English reveal a clear pattern whereby the new piece of knowledge transfers only to languages with which English has a high RankC score.
Pages: 10650-10666
Number of pages: 17