Corpus based part-of-speech tagging

Cited by: 6
Authors
Lv, Chengyao [1]
Liu, Huihua [1]
Dong, Yuanxing [1]
Chen, Yunliang [1, 2]
Affiliations
[1] China Univ Geosci, Sch Foreign Language, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Natural language processing; POS tagging; Hidden Markov models; Support vector machine; Neural networks; Gene expression programming;
DOI
10.1007/s10772-016-9356-2
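Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract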
In natural language processing, a crucial subsystem in a wide range of applications is the part-of-speech (POS) tagger, which labels (or classifies) unannotated words of natural language with POS labels corresponding to categories such as noun, verb or adjective. Mainstream approaches are generally corpus-based: a POS tagger learns from a corpus of pre-annotated data how to correctly tag unlabeled data. Presented here is a brief account of the state of the art in POS tagging. POS tagging approaches use a labeled corpus to train computational models. Several typical models of three kinds of tagging are introduced in this article: rule-based tagging, statistical approaches and evolutionary algorithms. The advantages and pitfalls of each are discussed and analyzed. Some rule-based and stochastic methods have achieved accuracies of 93-96%, while some evolutionary algorithms reach about 96-97%.
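The corpus-based idea in the abstract (learn from pre-annotated data, then tag unlabeled words) can be illustrated with a minimal sketch. This is a most-frequent-tag baseline, not one of the models surveyed in the article; the toy corpus, the tag names, and the function names `train_baseline` and `tag` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy pre-annotated corpus, invented for illustration: sentences of (word, tag) pairs.
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("chases", "VERB"), ("birds", "NOUN")],
]


def train_baseline(corpus):
    """Learn, for each word, the tag it carries most often in the corpus.

    Also returns the overall most frequent tag, used as a fallback
    for words that never appeared in the training data.
    """
    per_word = defaultdict(Counter)
    overall = Counter()
    for sentence in corpus:
        for word, tag_label in sentence:
            per_word[word.lower()][tag_label] += 1
            overall[tag_label] += 1
    lexicon = {w: counts.most_common(1)[0][0] for w, counts in per_word.items()}
    default_tag = overall.most_common(1)[0][0]
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Tag each word with its learned tag, or the fallback tag if unseen."""
    return [(w, lexicon.get(w.lower(), default_tag)) for w in words]


lexicon, default_tag = train_baseline(TRAIN)
print(tag(["The", "cat", "barks"], lexicon, default_tag))
```

A baseline like this ignores context entirely; statistical taggers such as hidden Markov models additionally score tag sequences, which is what lets them resolve words that take different tags in different positions.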
Pages: 647-654
Number of pages: 8
Related papers
50 in total
  • [21] A CONNECTIONIST APPROACH TO PART-OF-SPEECH TAGGING
    Zamora-Martinez, F.
    Castro-Bleda, M. J.
    Espana-Boquera, S.
    Tortajada, Salvador
    Aibar, P.
    IJCCI 2009: PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE, 2009, : 421 - +
  • [22] Part-of-speech tagging and partial parsing
    Abney, S
    CORPUS-BASED METHODS IN LANGUAGE AND SPEECH PROCESSING, 1997, 2 : 118 - 136
  • [23] Morphological Analysis Based Part-of-Speech Tagging for Uyghur Speech Synthesis
    Mamateli, Guljamal
    Rozi, Askar
    Ali, Gulnar
    Hamdulla, Askar
    KNOWLEDGE ENGINEERING AND MANAGEMENT, 2011, 123 : 389 - +
  • [24] Part of Speech Tagging - A Corpus Based Approach
    Rashmi, S.
    Hanumanthappa, M.
    SMART TRENDS IN INFORMATION TECHNOLOGY AND COMPUTER COMMUNICATIONS, SMARTCOM 2016, 2016, 628 : 88 - 96
  • [25] Part-of-speech Tagging Based on Dictionary and Statistical Machine Learning
    Ye Zhonglin
    Jia Zhen
    Huang Junfu
    Yin Hongfeng
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 6993 - 6998
  • [26] A novel approach to part-of-speech tagging based on latent analogy
    Bellegarda, Jerome R.
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4685 - 4688
  • [27] Transformation-based part-of-speech tagging for Serbian language
    Delic, Vlado
    Secujski, Milan
    Kupusinac, Aleksandar
    PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS (CIMMACS '09), 2009, : 98 - +
  • [28] Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging
    Sugisaki, Kyoko
    Wiedmer, Nicolas
    Hausendorf, Heiko
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 255 - 259
  • [29] Property-based Test for Part-of-Speech Tagging Tool
    Jin, Shuo
    Chen, Songqiang
    Xie, Xiaoyuan
    2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 1306 - 1311
  • [30] The computational complexity of rule-based part-of-speech tagging
    Oliva, K
    Kveton, P
    Ondruska, R
    TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 82 - 89