On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Cited by: 0
Authors
Wang, Zirui [1]
Lipton, Zachary C. [1]
Tsvetkov, Yulia [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Modern multilingual models are trained on concatenated text from multiple languages in the hope of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages. However, recent work has shown that this approach can degrade performance on high-resource languages, a phenomenon known as negative interference. In this paper, we present the first systematic study of negative interference. We show that, contrary to previous belief, negative interference also impacts low-resource languages. While parameters are maximally shared to learn language-universal structures, we demonstrate that language-specific parameters do exist in multilingual models and that they are a potential cause of negative interference. Motivated by these observations, we also present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference by adding language-specific layers as meta-parameters and training them in a manner that explicitly improves the shared layers' generalization on all languages. Overall, our results show that negative interference is more common than previously known, suggesting new directions for improving multilingual representations.
Pages: 4438-4450
Number of pages: 13
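The abstract describes the proposed remedy only at a high level. Below is a minimal, first-order sketch of that idea, not the authors' released implementation: language-specific adapter layers are treated as meta-parameters, a copy of the shared encoder is adapted to one language in an inner step, and the outer loss asks the adapted encoder to still serve the remaining languages, so its gradient pushes both the adapters and the shared layers toward cross-lingual generalization. The module names (SharedEncoder, LanguageAdapter, meta_step), the adapter design, and all hyperparameters are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn


class SharedEncoder(nn.Module):
    """Stand-in for the shared multilingual encoder body (illustrative)."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)


class LanguageAdapter(nn.Module):
    """Language-specific layer treated as meta-parameters (illustrative)."""
    def __init__(self, dim=128, n_classes=2):
        super().__init__()
        self.head = nn.Linear(dim, n_classes)

    def forward(self, h):
        return self.head(h)


def meta_step(shared, adapters, batches, inner_lr=1e-2, meta_lr=1e-3):
    """One first-order meta-update: adapt a copy of the shared encoder on each
    language, then require the adapted copy to still work for the other
    languages; the outer gradient updates the language-specific adapters and,
    via a first-order approximation, the original shared encoder."""
    loss_fn = nn.CrossEntropyLoss()
    params = list(shared.parameters()) + [p for a in adapters.values() for p in a.parameters()]
    meta_opt = torch.optim.SGD(params, lr=meta_lr)  # fresh optimizer: single illustrative step
    meta_opt.zero_grad()

    for lang, (x_in, y_in) in batches.items():
        # Inner step: one gradient update of a copied shared encoder on language `lang`.
        fast = copy.deepcopy(shared)
        inner_loss = loss_fn(adapters[lang](fast(x_in)), y_in)
        grads = torch.autograd.grad(inner_loss, list(fast.parameters()))
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= inner_lr * g

        # Outer step: the adapted encoder should generalize to the *other* languages.
        outer_loss = sum(
            loss_fn(adapters[other](fast(x_out)), y_out)
            for other, (x_out, y_out) in batches.items() if other != lang
        )
        outer_loss.backward()  # gradients reach the adapters and the adapted copy

        # First-order approximation: reuse the adapted copy's gradients
        # as the gradient for the original shared encoder.
        for p, pf in zip(shared.parameters(), fast.parameters()):
            if pf.grad is not None:
                p.grad = pf.grad.clone() if p.grad is None else p.grad + pf.grad

    meta_opt.step()


if __name__ == "__main__":
    torch.manual_seed(0)
    shared = SharedEncoder()
    adapters = {lang: LanguageAdapter() for lang in ("en", "sw")}
    batches = {lang: (torch.randn(8, 128), torch.randint(0, 2, (8,)))
               for lang in adapters}
    meta_step(shared, adapters, batches)
```

The sketch keeps to a first-order approximation so it stays short; differentiating through the inner update, as in full second-order MAML, would capture how the inner adaptation itself depends on the meta-parameters.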
Related papers
50 in total
  • [1] Meta-Learning for Wireless Interference Identification
    Owfi, Ali
    Afghah, Fatemeh
    Ashdown, Jonathan
    2023 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC, 2023,
  • [2] Meta-Learning for Effective Multi-task and Multilingual Modelling
    Tarunesh, Ishan
    Khyalia, Sushil
    Kumar, Vishwajeet
    Ramakrishnan, Ganesh
    Jyothi, Preethi
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 3600 - 3612
  • [3] Multilingual and cross-lingual document classification: A meta-learning approach
    van der Heijden, Niels
    Yannakoudakis, Helen
    Mishra, Pushkar
    Shutova, Ekaterina
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1966 - 1976
  • [4] Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection
    Awal, Md Rabiul
    Lee, Roy Ka-Wei
    Tanwar, Eshaan
    Garg, Tanmay
    Chakraborty, Tanmoy
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (01) : 1086 - 1095
  • [5] Meta-Learning Online Adaptation of Language Models
    Hu, Nathan
    Mitchell, Eric
    Manning, Christopher D.
    Finn, Chelsea
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 4418 - 4432
  • [6] One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
    Nekvinda, Tomas
    Dusek, Ondrej
    INTERSPEECH 2020, 2020, : 2972 - 2976
  • [7] Towards Enabling Meta-Learning from Target Models
    Lu, Su
    Ye, Han-Jia
    Gan, Le
    Zhan, De-Chuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [8] Meta-learning based selection of software reliability models
    Caiuta, Rafael
    Pozo, Aurora
    Vergilio, Silvia Regina
    Automated Software Engineering, 2017, 24 : 575 - 602
  • [9] Regularizing Neural Networks with Meta-Learning Generative Models
    Yamaguchi, Shin'ya
    Chijiwa, Daiki
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Kashima, Hisashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Probabilistic programming versus meta-learning as models of cognition
    Ong, Desmond C.
    Zhi-Xuan, Tan
    Tenenbaum, Joshua B.
    Goodman, Noah D.
    BEHAVIORAL AND BRAIN SCIENCES, 2024, 47