A trigram statistical language model algorithm for Chinese word segmentation

Cited by: 0
Authors
Mao, Jun [1]
Cheng, Gang [1]
He, Yanxiang [1]
Xing, Zehuan [2]
Affiliations
[1] Wuhan Univ, Comp Sch, Wuhan 430072, Peoples R China
[2] Cent China Normal Univ, Dept Linguist, Wuhan 430079, Peoples R China
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We address the problem of segmenting Chinese text into words and propose a trigram-model algorithm for the task. We discuss why a statistical language model is well suited to Chinese word segmentation and give the segmentation algorithm. In particular, we solve the search problem that the trigram model introduces, which often leads to low performance. Finally, we discuss the identification of out-of-vocabulary (OOV) words and merge it into the trigram-model-based method to improve segmentation accuracy.
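The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which this record does not give. It assumes a tiny hand-segmented training corpus, add-alpha smoothing, a dynamic-programming search whose states are the last two words, and single characters as a crude stand-in for OOV handling. All names (`train`, `make_log_prob`, `segment`, `BOS`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter

BOS = "<s>"  # sentence-start padding symbol (assumed convention)

def train(corpus):
    """Count trigrams and their two-word contexts in a pre-segmented corpus."""
    tri, ctx, lexicon = Counter(), Counter(), set()
    for sentence in corpus:
        padded = [BOS, BOS] + sentence
        lexicon.update(sentence)
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1
    return tri, ctx, lexicon

def make_log_prob(tri, ctx, vocab_size, alpha=0.1):
    """Add-alpha smoothed log P(w3 | w1, w2); the smoothing is an assumption."""
    def log_prob(w1, w2, w3):
        num = tri[(w1, w2, w3)] + alpha
        den = ctx[(w1, w2)] + alpha * vocab_size
        return math.log(num / den)
    return log_prob

def segment(text, lexicon, log_prob):
    """Return the highest-scoring segmentation under the trigram model."""
    n = len(text)
    if n == 0:
        return []
    max_len = max(map(len, lexicon), default=1)
    # states[j] maps (previous word, word ending at j) -> (score, backpointer)
    states = [dict() for _ in range(n + 1)]
    states[0][(BOS, BOS)] = (0.0, None)
    for i in range(n):
        for (w1, w2), (score, _) in states[i].items():
            for j in range(i + 1, min(n, i + max_len) + 1):
                w3 = text[i:j]
                # Multi-character candidates must be known words; single
                # characters are always allowed as an OOV fallback.
                if j - i > 1 and w3 not in lexicon:
                    continue
                new_score = score + log_prob(w1, w2, w3)
                key = (w2, w3)
                if key not in states[j] or new_score > states[j][key][0]:
                    states[j][key] = (new_score, (i, (w1, w2)))
    # Trace back from the best final state.
    key = max(states[n], key=lambda k: states[n][k][0])
    pos, words = n, []
    while pos > 0:
        _, (prev_pos, prev_key) = states[pos][key]
        words.append(key[1])
        pos, key = prev_pos, prev_key
    return words[::-1]

# Toy training data (hypothetical, for illustration only).
corpus = [
    ["研究", "生命", "的", "起源"],
    ["研究生", "的", "生活"],
    ["生命", "的", "起源"],
]
tri, ctx, lexicon = train(corpus)
log_prob = make_log_prob(tri, ctx, vocab_size=len(lexicon))
print(segment("研究生命的起源", lexicon, log_prob))  # ['研究', '生命', '的', '起源']
```

Keeping only the best score per (previous word, current word) pair at each position bounds the search by the number of word pairs rather than the number of full segmentations, which is one common way to tame the cost of a trigram search.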
Pages: 271 / +
Page count: 3