Corpus based part-of-speech tagging

Cited by: 6
Authors
Lv, Chengyao [1]
Liu, Huihua [1]
Dong, Yuanxing [1]
Chen, Yunliang [1,2]
Affiliations
[1] China Univ Geosci, Sch Foreign Language, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Natural language processing; POS tagging; Hidden Markov models; Support vector machine; Neural networks; Gene expression programming
DOI
10.1007/s10772-016-9356-2
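Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809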
Abstract
In natural language processing, a crucial subsystem in a wide range of applications is a part-of-speech (POS) tagger, which labels (or classifies) unannotated words of natural language with POS labels corresponding to categories such as noun, verb or adjective. Mainstream approaches are generally corpus-based: a POS tagger learns from a corpus of pre-annotated data how to correctly tag unlabeled data. Presented here is a brief account of the state of the art in POS tagging. POS tagging approaches use a labeled corpus to train computational models. Several typical models of three kinds of tagging are introduced in this article: rule-based tagging, statistical approaches and evolutionary algorithms. The advantages and pitfalls of each typical tagging approach are discussed and analyzed. Some rule-based and stochastic methods have achieved accuracies of 93-96%, while some evolutionary algorithms reach about 96-97%.
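To make the corpus-based idea in the abstract concrete, the following is a minimal sketch of one statistical approach named in the keywords, a bigram hidden Markov model decoded with the Viterbi algorithm. It is not the authors' implementation: the toy corpus `TRAIN`, the function names `train_hmm`, `log_prob` and `viterbi`, and the choice of add-one smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math

# Toy pre-annotated corpus (hypothetical data, for illustration only).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sent in sentences:
        prev = "<s>"              # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log-probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence for words under the bigram HMM."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1      # one extra slot for unseen words
    # Each column maps tag -> (best log-score so far, backpointer tag).
    columns = [{
        t: (log_prob(trans["<s>"], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            e = log_prob(emit[t], word, n_words)
            column[t] = max(
                (columns[-1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
        columns.append(column)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

trans, emit, vocab = train_hmm(TRAIN)
print(viterbi(["the", "cat", "barks"], trans, emit, vocab))
# prints ['DET', 'NOUN', 'VERB']
```

The sentence "the cat barks" never occurs in the toy corpus, yet the tagger labels it from the transition and emission counts it learned. A real tagger would train on a large annotated corpus and use a more careful treatment of unknown words.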
Pages: 647-654
Number of pages: 8
Related papers
50 in total
  • [31] Part-of-speech tagging with recurrent neural networks
    Pérez-Ortiz, JA
    Forcada, ML
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1588 - 1592
  • [32] Impact of imperfect OCR on part-of-speech tagging
    Lin, XF
    SEVENTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS I AND II, PROCEEDINGS, 2003, : 284 - 288
  • [33] Analyzing Tagging Accuracy of Part-of-Speech Taggers
    Khin, Nyein Pyae Pyae
    Aung, Than Nwe
    GENETIC AND EVOLUTIONARY COMPUTING, VOL II, 2016, 388 : 347 - 354
  • [34] High performance part-of-speech tagging of Bulgarian
    Doychinova, V
    Mihov, S
    ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, AND APPLICATIONS, PROCEEDINGS, 2004, 3192 : 246 - 255
  • [35] Dual Decomposition for Vietnamese Part-of-Speech Tagging
    Bach, Ngo Xuan
    Hiraishi, Kunihiko
    Le Minh, Nguyen
    Shimazu, Akira
    17TH INTERNATIONAL CONFERENCE IN KNOWLEDGE BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS - KES2013, 2013, 22 : 123 - 131
  • [36] Part-of-speech tagging using genetic algorithms
    Department of Computer Science and Engineering, Lovely Professional University, Jalandhar
    Punjab, India
    Int. J. Simul. Syst. Sci. Technol., 6 (11.1-11.7):
  • [37] On Certain Aspects of Kazakh Part-of-Speech Tagging
    Makazhanov, Aibek
    Yessenbayev, Zhandos
    Sabyrgaliyev, Islam
    Sharafudinov, Anuar
    Makhambetov, Olzhas
    2014 IEEE 8TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT), 2014, : 240 - 243
  • [38] Part-of-Speech Tagging Using Evolutionary Computation
    Silva, Ana Paula
    Silva, Arlindo
    Rodrigues, Irene
    NATURE INSPIRED COOPERATIVE STRATEGIES FOR OPTIMIZATION (NICSO 2013), 2014, 512 : 167 - +
  • [39] Part-of-Speech (POS) Tagging for the Nyishi Language
    Siram, Joyir
    Sambyo, Koj
    Sarkar, Achyuth
    ADVANCES IN INFORMATION COMMUNICATION TECHNOLOGY AND COMPUTING, AICTC 2021, 2022, 392 : 191 - 199
  • [40] Part-of-speech tagging for table of contents recognition
    Belaïd, A
    Pierron, L
    Valverde, N
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS: APPLICATIONS, ROBOTICS SYSTEMS AND ARCHITECTURES, 2000, : 451 - 454