Multilingual and cross-lingual document classification: A meta-learning approach

Cited by: 0
Authors
van der Heijden, Niels [1]
Yannakoudakis, Helen [2]
Mishra, Pushkar [3]
Shutova, Ekaterina [1]
Affiliations
[1] Univ Amsterdam, ILLC, Amsterdam, Netherlands
[2] Kings Coll London, Dept Informat, London, England
[3] Facebook AI, London, England
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in a limited-resource setting and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.
Pages: 1966-1976
Number of pages: 11
Related papers
50 in total
  • [21] Cross-Lingual Transfer Learning for Multilingual Task Oriented Dialog
    Schuster, Sebastian
    Gupta, Sonal
    Shah, Rushin
    Lewis, Mike
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3795 - 3805
  • [22] Cross-Lingual Document Similarity
    Muhic, Andrej
    Rupnik, Jan
    Skraba, Primoz
    PROCEEDINGS OF THE ITI 2012 34TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES (ITI), 2012, : 387 - 392
  • [23] Cross-lingual document clustering
    Wu, Ke
    Lu, Bao-Liang
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2007, 4426 : 956 - +
  • [24] Active Learning for Cross-Lingual Sentiment Classification
    Li, Shoushan
    Wang, Rong
    Liu, Huanhuan
    Huang, Chu-Ren
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2013, 2013, 400 : 236 - 246
  • [25] The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification
    Koksal, Abdullatif
    Ozgur, Arzucan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020,
  • [26] Cross-Lingual Classification of Political Texts Using Multilingual Sentence Embeddings
    Licht, Hauke
    POLITICAL ANALYSIS, 2023, 31 (03) : 366 - 379
  • [27] Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification
    Moreo, Alejandro
    Pedrotti, Andrea
    Sebastiani, Fabrizio
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (02)
  • [28] Cross-Lingual Text Classification with Model Translation and Document Translation
    Moh, Teng-Sheng
    Zhang, Zhang
    PROCEEDINGS OF THE 50TH ANNUAL ASSOCIATION FOR COMPUTING MACHINERY SOUTHEAST CONFERENCE, 2012,
  • [29] On cross-lingual retrieval with multilingual text encoders
    Litschko, Robert
    Vulic, Ivan
    Ponzetto, Simone Paolo
    Glavas, Goran
    INFORMATION RETRIEVAL JOURNAL, 2022, 25 (02) : 149 - 183
  • [30] A multilingual text mining approach to web cross-lingual text retrieval
    Chau, RW
    Yeh, CH
    KNOWLEDGE-BASED SYSTEMS, 2004, 17 (5-6) : 219 - 227