Corpus based part-of-speech tagging

Cited by: 6
Authors
Lv, Chengyao [1]
Liu, Huihua [1]
Dong, Yuanxing [1]
Chen, Yunliang [1, 2]
Affiliations
[1] China Univ Geosci, Sch Foreign Language, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; POS tagging; Hidden Markov models; Support vector machine; Neural networks; Gene expression programming
DOI
10.1007/s10772-016-9356-2
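Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract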
In natural language processing, a part-of-speech (POS) tagger is a crucial subsystem in a wide range of applications. It labels (or classifies) unannotated words of natural language with POS labels corresponding to categories such as noun, verb or adjective. Mainstream approaches are generally corpus-based: a POS tagger learns from a corpus of pre-annotated data how to correctly tag unlabeled data. This article presents a brief account of the state of the art in POS tagging. POS tagging approaches use a labeled corpus to train computational models. Several typical models of three kinds of tagging are introduced: rule-based tagging, statistical approaches and evolutionary algorithms. The advantages and pitfalls of each are discussed and analyzed. Some rule-based and stochastic methods have achieved accuracies of 93-96 %, while some evolutionary algorithms reach about 96-97 %.
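The corpus-based idea described in the abstract can be illustrated with a minimal sketch. It is not one of the models surveyed in the article: it is the common most-frequent-tag baseline, assuming a tiny hand-made corpus (`toy_corpus`), a coarse illustrative tag set (DET, NOUN, VERB, ADJ), and hypothetical function names chosen only for this example.

```python
from collections import Counter, defaultdict

def train_baseline_tagger(tagged_sentences):
    """Learn, for each word, the tag it carries most often in the corpus."""
    word_tag_counts = defaultdict(Counter)
    tag_counts = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            word_tag_counts[word.lower()][tag] += 1
            tag_counts[tag] += 1
    # Unknown words fall back to the most frequent tag overall.
    default_tag = tag_counts.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in word_tag_counts.items()}
    return lexicon, default_tag

def tag(words, lexicon, default_tag):
    """Tag a list of words using the learned lexicon."""
    return [(w, lexicon.get(w.lower(), default_tag)) for w in words]

def accuracy(tagged_sentences, lexicon, default_tag):
    """Fraction of tokens whose predicted tag matches the annotated tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        predicted = tag(words, lexicon, default_tag)
        for (_, gold), (_, guess) in zip(sentence, predicted):
            correct += gold == guess
            total += 1
    return correct / total

# Toy pre-annotated corpus (an assumption for illustration, not real data).
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("small", "ADJ"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("cats", "NOUN"), ("chase", "VERB"), ("dogs", "NOUN")],
]

lexicon, default_tag = train_baseline_tagger(toy_corpus)
print(tag(["the", "cat", "barks"], lexicon, default_tag))
```

A baseline of this kind ignores context entirely, which is the gap that context-aware models such as the hidden Markov models named in the keywords are meant to address.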
Pages: 647-654
Number of pages: 8