Neural Architecture for Tibetan Word Segmentation

Cited by: 0
Authors
Chen, Mengzhu [1]
Zhao, Shengjie [1, 2]
Yang, Kai [1]
Affiliations
[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai, Peoples R China
[2] Tongji Univ, Coll Software Engn, Shanghai, Peoples R China
Funding
National Science Foundation (United States)
Keywords
Tibetan word segmentation; conditional random field; contracted word recognition; recurrent neural network; long short-term memory
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Tibetan word segmentation (TWS) is a fundamental task in Tibetan language processing. This paper proposes a novel hybrid neural architecture for TWS, which is treated as a sequence tagging task. Because contracted words occur frequently in Tibetan sentences, we first use a conditional random field (CRF) to handle them. We then use character embeddings to obtain basic character representations as input. Most importantly, we apply a bidirectional long short-term memory network combined with a CRF (BiLSTM-CRF) to our system. Experimental results show that our approach achieves state-of-the-art performance compared with previous approaches to TWS.
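The sequence-tagging view described in the abstract can be illustrated with a small sketch. This record does not give the paper's tag set, scores, or implementation, so everything below is an assumption: a BMES tag scheme (begin, middle, end, single), hand-written emission scores standing in for BiLSTM outputs, a transition matrix standing in for the CRF layer, and the hypothetical helper names `viterbi` and `tags_to_words`.

```python
# Assumed tag set (not taken from the paper): B = begin, M = middle,
# E = end, S = single-unit word.
TAGS = ["B", "M", "E", "S"]


def viterbi(emissions, transitions):
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]: score of tag j at position t (e.g. from a BiLSTM).
    transitions[i][j]: score of moving from tag i to tag j (CRF layer).
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def tags_to_words(units, tags):
    """Group input units (characters or syllables) into words using BMES tags."""
    words, current = [], []
    for unit, tag in zip(units, tags):
        current.append(unit)
        if tag in ("E", "S"):
            words.append("".join(current))
            current = []
    if current:  # flush an unfinished word at the end of the input
        words.append("".join(current))
    return words


if __name__ == "__main__":
    emissions = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    transitions = [[0] * 4 for _ in range(4)]
    tags = [TAGS[i] for i in viterbi(emissions, transitions)]
    print(tags, tags_to_words(list("abc"), tags))
```

In a trained model the emission and transition scores would be learned; here they are fixed by hand only to show how decoding turns per-position scores into word boundaries.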
Pages: 367-370
Number of pages: 4