On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Cited by: 0
Authors
Wang, Zirui [1]
Lipton, Zachary C. [1]
Tsvetkov, Yulia [1]
Affiliation
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern multilingual models are trained on concatenated text from multiple languages in hopes of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages. However, recent work has shown that this approach can degrade performance on high-resource languages, a phenomenon known as negative interference. In this paper, we present the first systematic study of negative interference. We show that, contrary to previous belief, negative interference also impacts low-resource languages. While parameters are maximally shared to learn language-universal structures, we demonstrate that language-specific parameters do exist in multilingual models and they are a potential cause of negative interference. Motivated by these observations, we also present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference, by adding language-specific layers as meta-parameters and training them in a manner that explicitly improves shared layers' generalization on all languages. Overall, our results show that negative interference is more common than previously known, suggesting new directions for improving multilingual representations.
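The abstract's core mechanism — language-specific meta-parameters adapted per language in an inner step, with shared parameters updated so that the *post-adaptation* loss is low on every language — can be illustrated with a first-order toy sketch. This is not the paper's implementation; the scalar model, loss, targets, and learning rates below are all hypothetical, chosen only to make the inner/outer update structure concrete.

```python
# Toy first-order meta-learning sketch (all names and values hypothetical).
# A shared scalar parameter w plays the role of the shared layers; a
# per-language scalar phi[l] plays the role of the language-specific layers.

def loss(w, phi, target):
    # Squared-error stand-in for a per-language training loss.
    return (w + phi - target) ** 2

def grad(w, phi, target):
    # d(loss)/dw == d(loss)/dphi for this toy model.
    return 2.0 * (w + phi - target)

def meta_train(targets, steps=200, inner_lr=0.1, outer_lr=0.05):
    w = 0.0
    phi = {l: 0.0 for l in targets}  # language-specific meta-parameters
    for _ in range(steps):
        outer_grad = 0.0
        for l, t in targets.items():
            # Inner step: adapt only the language-specific parameter.
            phi[l] = phi[l] - inner_lr * grad(w, phi[l], t)
            # Outer gradient (first-order): move the shared parameter so the
            # post-adaptation loss shrinks on this language.
            outer_grad += grad(w, phi[l], t)
        # Outer step: average the per-language gradients, so w is pushed
        # toward generalizing across all languages rather than any single one.
        w -= outer_lr * outer_grad / len(targets)
    return w, phi

# Hypothetical per-language optima: w + phi[l] should approach each target.
targets = {"en": 2.0, "sw": -1.0}
w, phi = meta_train(targets)
```

After training, `w + phi[l]` sits near each language's target, showing how the shared and language-specific parts split the work; the real algorithm applies this structure to neural layers and minibatch losses rather than scalars.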
Pages: 4438–4450 (13 pages)