Cross-lingual transfer learning: A PARAFAC2 approach

Cited by: 1
Authors
Pantraki, Evangelia [1]
Tsingalis, Ioannis [1]
Kotropoulos, Constantine [1]
Affiliations
[1] Aristotle Univ Thessaloniki, Dept Informat, Thessaloniki, Greece
Keywords
PARAFAC2; Cross-lingual transfer learning; Cross-lingual document classification; Cross-lingual authorship attribution; Language processing; EMBEDDINGS;
DOI
10.1016/j.patrec.2022.05.008
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The proposed framework addresses the problem of cross-lingual transfer learning resorting to Parallel Factor Analysis 2 (PARAFAC2). To avoid the need for multilingual parallel corpora, a pairwise setting is adopted where a PARAFAC2 model is fitted to documents written in English (source language) and a different target language. Firstly, an unsupervised PARAFAC2 model is fitted to parallel unlabelled corpora pairs to learn the latent relationship between the source and target language. The fitted model is used to create embeddings for a text classification task (document classification or authorship attribution). Subsequently, a logistic regression classifier is fitted to the training source language embeddings and tested on the training target language embeddings. Following the zero-shot setting, no labels are exploited for the target language documents. The proposed framework incorporates a self-learning process by utilizing the predicted labels as pseudo-labels to train a new, pseudo-supervised PARAFAC2 model, which aims to extract latent class-specific information while fusing language-specific information. Thorough evaluation is conducted on cross-lingual document classification and cross-lingual authorship attribution. Remarkably, the proposed framework achieves competitive results when compared to deep learning methods in cross-lingual transfer learning tasks. (C) 2022 Elsevier B.V. All rights reserved.
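For orientation, the generic PARAFAC2 model that the abstract refers to can be written as below. This is the standard textbook formulation, not a statement copied from the paper. The symbols ($\mathbf{X}_k$, $\mathbf{Q}_k$, $\mathbf{H}$, $\mathbf{S}_k$, $\mathbf{V}$, the rank $R$) and the choice of a term-by-document layout with one matrix per language are assumptions made for illustration. The fold-in formula on the last line is likewise only one plausible way to embed a new document, and the paper may use a different one.

```latex
% Assumed layout: X_k is the I_k-by-J term-by-document matrix of language k
% (k = 1: English source, k = 2: target), with J parallel documents.
\begin{align}
  \mathbf{X}_k &\approx \mathbf{U}_k \mathbf{S}_k \mathbf{V}^{\top},
  \qquad \mathbf{U}_k = \mathbf{Q}_k \mathbf{H},
  \qquad \mathbf{Q}_k^{\top}\mathbf{Q}_k = \mathbf{I}_R,
  \qquad k = 1, 2, \\
  % Least-squares fitting objective over both languages
  \min_{\{\mathbf{Q}_k\},\, \mathbf{H},\, \{\mathbf{S}_k\},\, \mathbf{V}}
  &\;\sum_{k=1}^{2}
  \bigl\lVert \mathbf{X}_k - \mathbf{Q}_k \mathbf{H} \mathbf{S}_k \mathbf{V}^{\top} \bigr\rVert_F^{2}, \\
  % Hypothetical fold-in of a new document x (a column in language k);
  % assumes U_k has full column rank and S_k is invertible.
  \hat{\mathbf{v}} &= \mathbf{S}_k^{-1} \mathbf{U}_k^{+} \mathbf{x}.
\end{align}
```

Here $\mathbf{S}_k$ is an $R \times R$ diagonal matrix, $\mathbf{V}$ is the $J \times R$ factor shared by both languages, and $\mathbf{U}_k^{+}$ denotes the Moore-Penrose pseudoinverse. Under these assumptions, the shared factor $\mathbf{V}$ is what would tie the two languages to a common latent space, so that a classifier fitted on source-language embeddings can be applied to target-language embeddings.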
Pages: 167-173
Number of pages: 7
Related papers
50 in total
  • [41] Cross-Language Information Retrieval Using PARAFAC2
    Chew, Peter A.
    Bader, Brett W.
    Kolda, Tamara G.
    Abdelali, Ahmed
    KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2007, : 143 - +
  • [42] Cross-lingual sentiment transfer with limited resources
    Rasooli, Mohammad Sadegh
    Farra, Noura
    Radeva, Axinia
    Yu, Tao
    McKeown, Kathleen
    MACHINE TRANSLATION, 2018, 32 (1-2) : 143 - 165
  • [43] Multilingual and cross-lingual document classification: A meta-learning approach
    van der Heijden, Niels
    Yannakoudakis, Helen
    Mishra, Pushkar
    Shutova, Ekaterina
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1966 - 1976
  • [44] Cross-Lingual Knowledge Transfer for Clinical Phenotyping
    Papaioannou, Jens-Michalis
    Grundmann, Paul
    van Aken, Betty
    Samaras, Athanasios
    Kyparissidis, Ilias
    Giannakoulas, George
    Gers, Felix
    Loeser, Alexander
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 900 - 909
  • [45] PARAFAC2 and local minima
    Yu, Huiwen
    Bro, Rasmus
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2021, 219
  • [46] Metaphor Detection with Cross-Lingual Model Transfer
    Tsvetkov, Yulia
    Boytsov, Leonid
    Gershman, Anatole
    Nyberg, Eric
    Dyer, Chris
    PROCEEDINGS OF THE 52ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2014, : 248 - 258
  • [47] Cross-Lingual Transfer of Cognitive Processing Complexity
    Pouw, Charlotte
    Hollenstein, Nora
    Beinborn, Lisa
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 655 - 669
  • [48] Enhancing Cross-lingual Natural Language Inference by Prompt-learning from Cross-lingual Templates
    Qi, Kunxun
    Wan, Hai
    Du, Jianfeng
    Chen, Haolan
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 1910 - 1923
  • [49] Semi-supervised Learning on Cross-Lingual Sentiment Analysis with Space Transfer
    He, Xiaonan
    Zhang, Hui
    Chao, Wenhan
    Wang, Daqing
    2015 IEEE FIRST INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2015), 2015, : 371 - 377
  • [50] Domain Mismatch Doesn't Always Prevent Cross-Lingual Transfer Learning
    Edmiston, Daniel
    Keung, Phillip
    Smith, Noah A.
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 892 - 899