Unsupervised word sense disambiguation with N-gram features

Authors
Daniel Preotiuc-Pietro
Florentina Hristea
Affiliations
[1] University of Sheffield,Department of Computer Science
[2] University of Bucharest,Department of Computer Science
Keywords
Bayesian classification; The EM algorithm; Word sense disambiguation; Unsupervised disambiguation; Web-scale N-grams;
DOI
Not available
Abstract
This paper addresses feature selection for unsupervised word sense disambiguation (WSD) performed with an underlying Naïve Bayes model. It introduces web N-gram features which, to our knowledge, are used here for the first time in unsupervised WSD. By creating features from unlabeled data, we “help” a simple, knowledge-lean disambiguation algorithm to significantly increase its accuracy using easily obtainable knowledge. The performance of this method is compared with that of methods relying on completely different feature sets. Test results for nouns, adjectives and verbs show that web N-gram feature selection is a reliable alternative to existing approaches, provided that a “quality list” of features, adapted to the part of speech, is used.
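As a rough illustration of the setting the abstract describes, the following is a minimal sketch of a Naïve Bayes model trained with the EM algorithm on unlabeled contexts of one ambiguous word, with context words filtered through a feature list. It is not the authors' implementation. The function name `em_naive_bayes`, the toy contexts, the hand-written `features` set (a stand-in for a list that the paper derives from web N-grams), the add-alpha smoothing, and the round-robin initialisation are all assumptions made for this example.

```python
import math
from collections import Counter


def em_naive_bayes(contexts, feature_list, n_senses=2, n_iter=30, alpha=1.0):
    """Cluster contexts of one ambiguous word into n_senses latent senses.

    Hypothetical helper: a sketch of Naive Bayes + EM, not the paper's code.
    """
    vocab = sorted(feature_list)
    # Keep only words on the feature list; everything else is ignored.
    bags = [Counter(w for w in ctx if w in feature_list) for ctx in contexts]

    # Assumption: deterministic round-robin start so the run is reproducible.
    # Random restarts are the more common choice in practice.
    post = [[1.0 if s == i % n_senses else 0.0 for s in range(n_senses)]
            for i in range(len(bags))]

    for _ in range(n_iter):
        # M-step: re-estimate P(sense) and P(word | sense), add-alpha smoothed.
        prior, cond = [], []
        for s in range(n_senses):
            weight = sum(p[s] for p in post)
            prior.append((weight + alpha) / (len(bags) + alpha * n_senses))
            counts = {w: alpha for w in vocab}
            for bag, p in zip(bags, post):
                for w, c in bag.items():
                    counts[w] += p[s] * c
            total = sum(counts.values())
            cond.append({w: counts[w] / total for w in vocab})

        # E-step: posterior P(sense | context) under the Naive Bayes model,
        # computed in log space and normalised.
        post = []
        for bag in bags:
            logp = [math.log(prior[s])
                    + sum(c * math.log(cond[s][w]) for w, c in bag.items())
                    for s in range(n_senses)]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            post.append([u / z for u in unnorm])

    # Hard sense label for each context: the most probable latent sense.
    return [max(range(n_senses), key=lambda s: p[s]) for p in post]


# Toy contexts for the ambiguous word "bank" (invented for illustration).
contexts = [
    ["the", "money", "loan", "deposit"],
    ["the", "loan", "money", "interest"],
    ["the", "deposit", "interest", "money"],
    ["the", "river", "water", "shore"],
    ["the", "water", "river", "fishing"],
    ["the", "shore", "fishing", "river"],
]
# Hand-written stand-in for a "quality list" of features.
features = {"money", "loan", "deposit", "interest",
            "river", "water", "shore", "fishing"}

labels = em_naive_bayes(contexts, features)
print(labels)
```

The sense indices returned are arbitrary cluster identifiers, so only the grouping of contexts is meaningful. The sketch says nothing about how a feature list would be built from web N-gram counts or adapted to the part of speech, which is the paper's actual contribution.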
Pages: 241-260
Page count: 19