On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Cited by: 0
Authors
Wang, Zirui [1]
Lipton, Zachary C. [1]
Tsvetkov, Yulia [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Modern multilingual models are trained on concatenated text from multiple languages in the hope of conferring benefits to each (positive transfer), with the most pronounced benefits accruing to low-resource languages. However, recent work has shown that this approach can degrade performance on high-resource languages, a phenomenon known as negative interference. In this paper, we present the first systematic study of negative interference. We show that, contrary to previous belief, negative interference also impacts low-resource languages. While parameters are maximally shared to learn language-universal structures, we demonstrate that language-specific parameters do exist in multilingual models and that they are a potential cause of negative interference. Motivated by these observations, we also present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference by adding language-specific layers as meta-parameters and training them in a manner that explicitly improves the shared layers' generalization on all languages. Overall, our results show that negative interference is more common than previously known, suggesting new directions for improving multilingual representations.
Pages: 4438-4450
Number of pages: 13
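The abstract describes the proposed remedy only at a high level. Below is a minimal, first-order sketch of that idea, not the authors' released implementation: language-specific adapter layers are treated as meta-parameters, a copy of the shared encoder is adapted to one language in an inner step, and the outer loss asks the adapted encoder to still serve the remaining languages, so its gradient pushes both the adapters and the shared layers toward cross-lingual generalization. The module names (SharedEncoder, LanguageAdapter, meta_step), the adapter design, and all hyperparameters are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn


class SharedEncoder(nn.Module):
    """Stand-in for the shared multilingual encoder body (illustrative)."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)


class LanguageAdapter(nn.Module):
    """Language-specific layer treated as meta-parameters (illustrative)."""
    def __init__(self, dim=128, n_classes=2):
        super().__init__()
        self.head = nn.Linear(dim, n_classes)

    def forward(self, h):
        return self.head(h)


def meta_step(shared, adapters, batches, inner_lr=1e-2, meta_lr=1e-3):
    """One first-order meta-update: adapt a copy of the shared encoder on each
    language, then require the adapted copy to still work for the other
    languages; the outer gradient updates the language-specific adapters and,
    via a first-order approximation, the original shared encoder."""
    loss_fn = nn.CrossEntropyLoss()
    params = list(shared.parameters()) + [p for a in adapters.values() for p in a.parameters()]
    meta_opt = torch.optim.SGD(params, lr=meta_lr)  # fresh optimizer: single illustrative step
    meta_opt.zero_grad()

    for lang, (x_in, y_in) in batches.items():
        # Inner step: one gradient update of a copied shared encoder on language `lang`.
        fast = copy.deepcopy(shared)
        inner_loss = loss_fn(adapters[lang](fast(x_in)), y_in)
        grads = torch.autograd.grad(inner_loss, list(fast.parameters()))
        with torch.no_grad():
            for p, g in zip(fast.parameters(), grads):
                p -= inner_lr * g

        # Outer step: the adapted encoder should generalize to the *other* languages.
        outer_loss = sum(
            loss_fn(adapters[other](fast(x_out)), y_out)
            for other, (x_out, y_out) in batches.items() if other != lang
        )
        outer_loss.backward()  # gradients reach the adapters and the adapted copy

        # First-order approximation: reuse the adapted copy's gradients
        # as the gradient for the original shared encoder.
        for p, pf in zip(shared.parameters(), fast.parameters()):
            if pf.grad is not None:
                p.grad = pf.grad.clone() if p.grad is None else p.grad + pf.grad

    meta_opt.step()


if __name__ == "__main__":
    torch.manual_seed(0)
    shared = SharedEncoder()
    adapters = {lang: LanguageAdapter() for lang in ("en", "sw")}
    batches = {lang: (torch.randn(8, 128), torch.randint(0, 2, (8,)))
               for lang in adapters}
    meta_step(shared, adapters, batches)
```

The sketch keeps to a first-order approximation so it stays short; differentiating through the inner update, as in full second-order MAML, would capture how the inner adaptation itself depends on the meta-parameters.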
Related papers
50 in total
  • [1] Meta-Learning for Wireless Interference Identification
    Owfi, Ali
    Afghah, Fatemeh
    Ashdown, Jonathan
    2023 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC, 2023,
  • [2] Meta-Learning for Effective Multi-task and Multilingual Modelling
    Tarunesh, Ishan
    Khyalia, Sushil
    Kumar, Vishwajeet
    Ramakrishnan, Ganesh
    Jyothi, Preethi
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 3600 - 3612
  • [3] Multilingual and cross-lingual document classification: A meta-learning approach
    van der Heijden, Niels
    Yannakoudakis, Helen
    Mishra, Pushkar
    Shutova, Ekaterina
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1966 - 1976
  • [4] Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection
    Awal, Md Rabiul
    Lee, Roy Ka-Wei
    Tanwar, Eshaan
    Garg, Tanmay
    Chakraborty, Tanmoy
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, 11 (01) : 1086 - 1095
  • [5] Meta-Learning Online Adaptation of Language Models
    Hu, Nathan
    Mitchell, Eric
    Manning, Christopher D.
    Finn, Chelsea
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 4418 - 4432
  • [6] One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
    Nekvinda, Tomas
    Dusek, Ondrej
    INTERSPEECH 2020, 2020, : 2972 - 2976
  • [7] Towards Enabling Meta-Learning from Target Models
    Lu, Su
    Ye, Han-Jia
    Gan, Le
    Zhan, De-Chuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [8] Meta-learning based selection of software reliability models
    Caiuta, Rafael
    Pozo, Aurora
    Vergilio, Silvia Regina
    Automated Software Engineering, 2017, 24 : 575 - 602
  • [9] Regularizing Neural Networks with Meta-Learning Generative Models
    Yamaguchi, Shin'ya
    Chijiwa, Daiki
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Kashima, Hisashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Probabilistic programming versus meta-learning as models of cognition
    Ong, Desmond C.
    Zhi-Xuan, Tan
    Tenenbaum, Joshua B.
    Goodman, Noah D.
    BEHAVIORAL AND BRAIN SCIENCES, 2024, 47