Multilingual and cross-lingual document classification: A meta-learning approach

Authors
van der Heijden, Niels [1]
Yannakoudakis, Helen [2]
Mishra, Pushkar [3]
Shutova, Ekaterina [1]
Affiliations
[1] Univ Amsterdam, ILLC, Amsterdam, Netherlands
[2] Kings Coll London, Dept Informat, London, England
[3] Facebook AI, London, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource settings and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.
Pages: 1966-1976
Number of pages: 11