A hybrid model for sense guessing of Chinese unknown words

Cited by: 0
Authors
Likun Qiu, Kai Zhao, Changjian Hu
Institutions
Department of Chinese Language and Literature, Peking University, China [1]
Unknown [2]
Keywords
Natural language processing systems
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a hybrid model for guessing the senses of Chinese unknown words. Three types of similarity (positional, syntactic, and semantic) are analyzed, and a model is developed for each. The three models are then combined into a hybrid model, the HPPS Model. To verify the effectiveness and consistency of HPPS, experiments were conducted on ten test sets collected from two popular Chinese thesauri, Cilin and HowNet. Additional experiments were run on a test set of 2000 words collected from newspapers. The experiments show that the HPPS Model consistently yields a 4% to 6% F-score improvement over the best results reported in previous research. © 2009 by Likun Qiu, Kai Zhao, and Changjian Hu.
Related papers
  • [1] Guessing the Meaning of Unknown Words While Reading
    YIN Zhao rong, LU Lei (English Department, Zaozhuang Teachers College, Zaozhuang, China; Secondary light in the industry of Zaozhuang City, Zaozhuang, China)
    枣庄师专学报 (Journal of Zaozhuang Teachers College), 2001, (03) : 75 - 77
  • [2] A METHOD OF PART-OF-SPEECH GUESSING OF CHINESE UNKNOWN WORDS BASED ON COMBINED FEATURES
    Zhang, Hai-Jun
    Shi, Shu-Min
    Feng, Chong
    Huang, He-Yan
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 328 - +
  • [3] Hybrid approach for Khmer unknown word POS guessing
    Nou, Chenda
    Kameyama, Wataru
    IRI 2007: PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2007, : 215 - +
  • [4] Guessing Parts-of-Speech of Unknown Words Using Global Information
    Nakagawa, Tetsuji
    Matsumoto, Yuji
    COLING/ACL 2006, VOLS 1 AND 2, PROCEEDINGS OF THE CONFERENCE, 2006, : 705 - 712
  • [5] Chinese POS Disambiguation and Unknown Word Guessing with Lexicalized HMMs
    Fu, Guohong
    Luke, Kang-Kwong
    INTERNATIONAL JOURNAL OF TECHNOLOGY AND HUMAN INTERACTION, 2006, 2 (01) : 39 - 50
  • [6] A Hybrid Model for Chinese Confusable Words Distinguishing in Proofreading
    Li, Luozheng
    Song, Peipei
    Zhang, Dan
    Zhao, Dongyan
    CHINESE LEXICAL SEMANTICS, CLSW 2021, PT I, 2022, 13249 : 464 - 473
  • [7] Chinese Unknown Words Extraction for Incomplete Sentences
    Chen, Yi-Hui
    Lu, Eric Jui-Lin
    Huang, Jeng-Jie
    International Journal of Network Security, 2022, 24 (04) : 755 - 764
  • [8] An iterative method for extracting Chinese unknown words
    He, S
    Zhu, J
    CHINESE JOURNAL OF ELECTRONICS, 2001, 10 (04) : 461 - 464
  • [9] Segmenting Chinese unknown words by heuristic method
    Yang, CC
    Li, KW
    DIGITAL LIBRARIES: TECHNOLOGY AND MANAGEMENT OF INDIGENOUS KNOWLEDGE FOR GLOBAL ACCESS, 2003, 2911 : 510 - 520
  • [10] A probabilistic model for guessing base forms of new words by analogy
    Linden, Krister
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2008, 4919 : 106 - 116