Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin

Authors
Lin, Pin-Jie [1, 2]
Saeed, Muhammed [1]
Chang, Ernie [3]
Scholman, Merel [2, 4]
Affiliations
[1] Saarland Informat Campus, Saarbrucken, Germany
[2] Saarland Univ, Language Sci & Technol, Saarbrucken, Germany
[3] Meta Inc, Real Labs, Menlo Pk, CA USA
[4] Univ Utrecht, ILS, Utrecht, Netherlands
Source
Keywords
spoken language understanding; low-resource machine translation; low-resource language
DOI
10.21437/Interspeech.2023-466
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Developing effective spoken language processing systems for low-resource languages poses several challenges due to the lack of parallel data and limited resources for fine-tuning models. In this work, we aim to improve both text classification and translation of Nigerian Pidgin (Naija) by collecting a large-scale parallel English-Pidgin corpus, and we further propose a cross-lingual adaptive training framework that combines continual and task-adaptive training to adapt a base pre-trained model to low-resource languages. Our studies show that English pre-trained language models serve as a stronger prior than multilingual language models on English-Pidgin tasks, with improvements of up to 2.38 BLEU, and demonstrate that augmenting orthographic data and using task-adaptive training with back-translation can have a significant impact on model performance.
Pages: 3954-3958
Page count: 5