Semi-supervised maximum entropy based POS tagging for large scale Chinese corpus

Authors
Yuan, Caixia [1]
Wang, Xiaojie [1]
Zhai, Junjie [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Informat Engn, Beijing 100876, Peoples R China
Keywords
semi-supervised; maximum entropy; Chinese POS tagging
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses part-of-speech (POS) tagging for a large-scale Chinese corpus using the maximum entropy technique, introducing unlabeled data to compensate for the sparseness and inconsistency of the labeled data. We test our method on the Peking University corpus and show that the semi-supervised strategy yields an error reduction of as much as 27%.
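The abstract does not say how the 27% figure is computed. A common reading, assumed here, is a relative error reduction: the share of the baseline tagger's errors that the semi-supervised tagger removes. The sketch below illustrates that reading only; the function name and the two accuracy values are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fraction of the baseline's errors removed by the new system.

    Assumes both arguments are accuracies in [0, 1].
    """
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical accuracies chosen only for illustration (not from the paper).
reduction = relative_error_reduction(baseline_accuracy=0.90, new_accuracy=0.927)
print(f"relative error reduction: {reduction:.0%}")
```

Under this reading, a large relative reduction can correspond to a gain of only a few accuracy points when the baseline is already strong.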
Pages: 385 - 389
Page count: 5
Related papers
50 in total
  • [41] Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation
    Ravi, Sujith
    Diao, Qiming
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 519 - 528
  • [42] Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search
    Pan, Yingwei
    Yao, Ting
    Li, Houqiang
    Ngo, Chong-Wah
    Mei, Tao
    SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 53 - 62
  • [43] SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval
    Zhang, Jian
    Peng, Yuxin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (01) : 212 - 225
  • [44] Semi-supervised Vision Transformers at Scale
    Cai, Zhaowei
    Ravichandran, Avinash
    Favaro, Paolo
    Wang, Manchen
    Modolo, Davide
    Bhotika, Rahul
    Tu, Zhuowen
    Soatto, Stefano
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [45] Semi-Supervised Chinese Compound Word Extraction Based on HMM
    He, Hui
    Chen, Bo
    Guo, Jun
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 2077 - 2081
  • [46] Chinese Short Text Categorization Based on Semi-Supervised Learning
    Ma, Jie
    Xiong, Zhong-Yang
    Zhang, Yu-Fang
    Wang, Liu-Qian
    Xie, Jiang
    3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND MECHANICAL AUTOMATION (CSMA 2017), 2017, : 45 - 54
  • [47] Semi-supervised categorization of documents using the Web as corpus
    Guzman Cabrera, Rafael
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2011, (46) : 127 - 128
  • [48] Large margin semi-supervised learning
    Wang, Junhui
    Shen, Xiaotong
    JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 1867 - 1891
  • [49] Semi-Supervised Technical Term Tagging With Minimal User Feedback
    QasemiZadeh, Behrang
    Buitelaar, Paul
    Chen, Tianqi
    Bordea, Georgeta
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 617 - 621
  • [50] Semi-supervised Part-of-speech Tagging in Speech Applications
    Dufour, Richard
    Favre, Benoit
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1373 - 1376