Multilingual and cross-lingual document classification: A meta-learning approach

Authors
van der Heijden, Niels [1]
Yannakoudakis, Helen [2]
Mishra, Pushkar [3]
Shutova, Ekaterina [1]
Affiliations
[1] Univ Amsterdam, ILLC, Amsterdam, Netherlands
[2] Kings Coll London, Dept Informat, London, England
[3] Facebook AI, London, England
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource settings and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple, yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.
Pages: 1966-1976
Page count: 11