A Neural Joint Model with BERT for Burmese Syllable Segmentation, Word Segmentation, and POS Tagging

Cited by: 4
Authors
Mao, Cunli [1]
Man, Zhibo [1]
Yu, Zhengtao [1]
Gao, Shengxiang [1]
Wang, Zhenhan [1]
Wang, Hongbin [1]
Affiliation
[1] Kunming Univ Sci & Technol, Key Lab Artificial Intelligence Informat Engn & A, Kunming, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Burmese; word segmentation; POS tagging; joint training; BiLSTM-CRF; BERT;
DOI
10.1145/3436818
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The syllable is the smallest semantic unit of the Burmese language. This study proposes the first neural joint learning model for Burmese syllable segmentation, word segmentation, and part-of-speech (POS) tagging with BERT. The proposed model alleviates the error propagation problem of syllable segmentation. More specifically, it extends the neural joint model for Vietnamese word segmentation, POS tagging, and dependency parsing [28] with pre-trained Burmese character, syllable, and word embeddings and BiLSTM-CRF-based neural layers. To evaluate the proposed model, we fine-tune multilingual BERT and carry out experiments on Burmese benchmark datasets. The results show that the proposed joint model achieves excellent performance.
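One common way to treat word segmentation and POS tagging as a single sequence-labeling problem is to give every syllable a joint label that combines a boundary marker (B for word-initial, I for word-internal) with a POS tag. The sketch below only illustrates how such labels could be decoded into words; it is not taken from the paper, and the label format, the function name `decode_joint_tags`, and the romanized placeholder syllables are all assumptions made for illustration.

```python
from typing import List, Tuple


def decode_joint_tags(syllables: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-syllable joint tags such as 'B-n' / 'I-n' into (word, POS) pairs.

    Hypothetical label scheme (an assumption, not the paper's): '<B|I>-<pos>'.
    """
    if len(syllables) != len(tags):
        raise ValueError("syllables and tags must have the same length")

    words: List[Tuple[str, str]] = []
    current: List[str] = []
    current_pos = ""
    for syllable, tag in zip(syllables, tags):
        boundary, _, pos = tag.partition("-")
        # Open a new word on 'B', at the start, or when an 'I' tag
        # carries a POS that disagrees with the word being built.
        if boundary == "B" or not current or pos != current_pos:
            if current:
                words.append(("".join(current), current_pos))
            current, current_pos = [syllable], pos
        else:
            current.append(syllable)
    if current:
        words.append(("".join(current), current_pos))
    return words


# Placeholder syllables stand in for Burmese syllables.
example = decode_joint_tags(["ka", "la", "pa", "ti"], ["B-n", "I-n", "B-v", "B-part"])
print(example)
```

Syllables are joined without a separator because Burmese is normally written without spaces between words; a tagger that predicts these joint labels yields both the segmentation and the POS sequence in one pass.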
Pages: 23