Intent detection and slot filling for Persian: Cross-lingual training for low-resource languages

Cited by: 1
Authors
Zadkamali, Reza [1]
Momtazi, Saeedeh [1]
Zeinali, Hossein [1]
Affiliations
[1] Amirkabir Univ Technol, Tehran, Iran
Source
NATURAL LANGUAGE PROCESSING | 2025 / Vol. 31 / Issue 02
Keywords
intent detection; slot filling; Persian language understanding; joint learning; low-resource languages
DOI
10.1017/nlp.2024.17
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Intent detection and slot filling are two necessary tasks for natural language understanding. Deep neural models have already shown great ability facing sequence labeling and sentence classification tasks, but they require a large amount of training data to achieve accurate results. However, in many low-resource languages, creating accurate training data is problematic. Consequently, in most of the language processing tasks, low-resource languages have significantly lower accuracy than rich-resource languages. Hence, training models in low-resource languages with data from a richer-resource language can be advantageous. To solve this problem, in this paper, we used pretrained language models, namely multilingual BERT (mBERT) and XLM-RoBERTa, in different cross-lingual and monolingual scenarios. To evaluate our proposed model, we translated a small part of the Airline Travel Information System (ATIS) dataset into Persian. Furthermore, we repeated the experiments on the MASSIVE dataset to increase our results' reliability. Experimental results on both datasets show that the cross-lingual scenarios significantly outperform monolingual ones.
Pages: 559-574
Page count: 16