Applying class triggers in Chinese POS tagging based on maximum entropy model

Authors
Zhao, Y [1]
Wang, XL [1]
Liu, BQ [1]
Guan, Y [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Keywords
Chinese POS tagging; trigger; average mutual information; maximum entropy
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a method for applying class triggers to Chinese POS tagging within a maximum entropy model. First, a feature template of the form "word->word/tag" is used to extract triggers from a corpus, and the extracted triggers are added to the maximum entropy model as a new kind of feature. Next, average mutual information is used for feature selection, and a semantic lexicon is used to build class triggers that mitigate the data sparseness problem. An experience-based solution to the over-fitting problem in model training is also presented. Finally, the system is evaluated on a manually annotated POS-tagged corpus. The experiments show that the method raises POS tagging accuracy from 94% to 96% relative to an HMM smoothed with absolute smoothing.
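The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how average mutual information is commonly computed for a trigger pair: the sum, over the four present/absent combinations of a trigger word A and a triggered word/tag event B, of P(a,b) log2(P(a,b) / (P(a)P(b))). The function name, the count-based interface, and the choice of base-2 logarithms are assumptions made for illustration, not details taken from the paper.

```python
import math


def average_mutual_information(n_ab: int, n_a: int, n_b: int, n: int) -> float:
    """Average mutual information (in bits) between two binary events.

    Hypothetical interface, assumed for this sketch:
      n_ab: number of contexts where trigger A and event B both occur
      n_a:  number of contexts where A occurs
      n_b:  number of contexts where B occurs
      n:    total number of contexts
    """
    # Joint counts for the four cells: (A,B), (A,not B), (not A,B), (not A,not B)
    cells = [
        (n_ab, n_a, n_b),
        (n_a - n_ab, n_a, n - n_b),
        (n_b - n_ab, n - n_a, n_b),
        (n - n_a - n_b + n_ab, n - n_a, n - n_b),
    ]
    total = 0.0
    for joint, marg_a, marg_b in cells:
        if joint == 0:
            continue  # a cell with zero joint probability contributes nothing
        p_joint = joint / n
        total += p_joint * math.log2(p_joint / ((marg_a / n) * (marg_b / n)))
    return total


# Candidate triggers could then be ranked by score and only the top ones kept.
independent = average_mutual_information(25, 50, 50, 100)
dependent = average_mutual_information(50, 50, 50, 100)
```

Under this definition, a pair whose events are statistically independent scores zero, and higher scores indicate pairs that are more informative as features.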
Pages: 1641-1645
Page count: 5