KinyaBERT: a Morphology-aware Kinyarwanda Language Model

Cited by: 0
Authors:
Nzeyimana, Antoine [1 ]
Rubungo, Andre Niyongabo [2 ]
Affiliations:
[1] Univ Massachusetts, Amherst, MA 01003 USA
[2] Univ Politecn Cataluna, Barcelona, Spain
DOI: not available
CLC number: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Pre-trained language models such as BERT have been successful at tackling many natural language processing tasks. However, the unsupervised sub-word tokenization methods commonly used in these models (e.g., byte-pair encoding - BPE) are sub-optimal at handling morphologically rich languages. Even given a morphological analyzer, naive sequencing of morphemes into a standard BERT architecture is inefficient at capturing morphological compositionality and expressing word-relative syntactic regularities. We address these challenges by proposing a simple yet effective two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality. Despite the success of BERT, most of its evaluations have been conducted on high-resource languages, obscuring its applicability to low-resource languages. We evaluate our proposed method on the low-resource morphologically rich Kinyarwanda language, naming the proposed model architecture KinyaBERT. A robust set of experimental results reveals that KinyaBERT outperforms solid baselines by 2% in F1 score on a named entity recognition task and by 4.3% in average score of a machine-translated GLUE benchmark. KinyaBERT fine-tuning has better convergence and achieves more robust results on multiple tasks even in the presence of translation noise.
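As a concrete illustration of the two-tier idea described in the abstract, the following minimal PyTorch sketch composes each word's morphemes with a small morpheme-level transformer and then contextualizes the resulting word vectors with a sentence-level transformer. All names, dimensions, and the mean-pooling composition step are illustrative assumptions; the actual KinyaBERT design (e.g., how the morphological analyzer's output is injected) differs in detail.

import torch
import torch.nn as nn

class TwoTierEncoder(nn.Module):
    def __init__(self, morph_vocab=10000, morph_dim=128, word_dim=256,
                 morph_layers=1, word_layers=4):
        super().__init__()
        # Tier 1: embed morphemes and encode them within each word.
        self.morph_emb = nn.Embedding(morph_vocab, morph_dim, padding_idx=0)
        m_layer = nn.TransformerEncoderLayer(morph_dim, nhead=4, batch_first=True)
        self.morph_enc = nn.TransformerEncoder(m_layer, num_layers=morph_layers)
        self.proj = nn.Linear(morph_dim, word_dim)
        # Tier 2: contextualize the composed word vectors across the sentence.
        w_layer = nn.TransformerEncoderLayer(word_dim, nhead=8, batch_first=True)
        self.word_enc = nn.TransformerEncoder(w_layer, num_layers=word_layers)

    def forward(self, morph_ids):
        # morph_ids: (batch, words, morphemes) integer morpheme ids, 0 = padding.
        b, w, m = morph_ids.shape
        flat = morph_ids.view(b * w, m)
        pad = flat.eq(0)
        h = self.morph_enc(self.morph_emb(flat), src_key_padding_mask=pad)
        # Mean-pool the non-padding morpheme states into one vector per word.
        keep = (~pad).unsqueeze(-1).float()
        word_vecs = (h * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        word_vecs = self.proj(word_vecs).view(b, w, -1)
        return self.word_enc(word_vecs)  # (batch, words, word_dim)

# Example: 2 sentences, 5 words each, up to 4 morphemes per word.
ids = torch.randint(1, 10000, (2, 5, 4))
print(TwoTierEncoder()(ids).shape)  # torch.Size([2, 5, 256])

The point of the two tiers is that the sentence-level encoder sees one position per word rather than one per sub-word token, so the word-relative syntactic regularities the abstract mentions operate over whole-word representations.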
Pages: 5347 - 5363 (17 pages)
Related papers (50 in total)
  • [31] CxLM: A Construction and Context-aware Language Model
    Tseng, Yu-Hsiang
    Shih, Cing-Fang
    Chen, Pin-Er
    Chou, Hsin-Yu
    Ku, Mao-Chang
    Hsieh, Shu-Kai
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6361 - 6369
  • [32] Entity-Aware Language Model as an Unsupervised Reranker
    Rasooli, Mohammad Sadegh
    Parthasarathy, Sarangarajan
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 406 - 410
  • [33] Morphological Verb-Aware Tibetan Language Model
    Khysru, Kuntharrgyal
    Jin, Di
    Dang, Jianwu
    IEEE ACCESS, 2019, 7 : 72896 - 72904
  • [34] Morphology aware data augmentation with neural language models for online hybrid ASR
    Tarjan, Balazs
    Fegyo, Tibor
    Mihajlik, Peter
    ACTA LINGUISTICA ACADEMICA, 2022, 69 (04): 581 - 598
  • [35] Morphology Model and Segmentation for Old Turkic Language
    Zhanabergenova, Dinara
    Tukeyev, Ualsher
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2021), 2021, 12876 : 629 - 642
  • [36] Time-aware mixed language model for microblog search
    Wei, Bing-Jie
    Wang, Bin
    Jisuanji Xuebao/Chinese Journal of Computers, 2014, 37 (01): 229 - 237
  • [37] Sharpness-Aware Minimization Improves Language Model Generalization
    Bahri, Dara
    Mobahi, Hossein
    Tay, Yi
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 7360 - 7371
  • [39] A Discriminative Entity-Aware Language Model for Virtual Assistants
    Saebi, Mandana
    Pusateri, Ernest
    Meghawat, Aaksha
    Van Gysel, Christophe
    INTERSPEECH 2021, 2021, : 2032 - 2036
  • [40] A Context-Aware Language Model for Spoken Query Retrieval
    Zhong, Yapin
    Gilbert, Juan E.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2005, 8 (02) : 203 - 219