Semi-supervised maximum entropy based POS tagging for large scale Chinese corpus

Cited by: 0
Authors
Yuan, Caixia [1]
Wang, Xiaojie [1]
Zhai, Junjie [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat Engn, Beijing 100876, Peoples R China
Keywords
semi-supervised; maximum entropy; Chinese POS tagging
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses POS tagging of a large-scale Chinese corpus with the maximum entropy technique, introducing unlabeled data to compensate for the sparseness and inconsistency of the labeled data. We test our method on the Peking University corpus and show that the semi-supervised strategy reduces errors by as much as 27%.
Pages: 385-389
Number of pages: 5