Unsupervised word sense disambiguation with N-gram features

Cited by: 0
Authors
Daniel Preotiuc-Pietro
Florentina Hristea
Affiliations
[1] University of Sheffield,Department of Computer Science
[2] University of Bucharest,Department of Computer Science
Source
Keywords
Bayesian classification; The EM algorithm; Word sense disambiguation; Unsupervised disambiguation; Web-scale N-grams
DOI
Not available
Abstract
The present paper concentrates on the issue of feature selection for unsupervised word sense disambiguation (WSD) performed with an underlying Naïve Bayes model. It introduces web N-gram features which, to our knowledge, are used for the first time in unsupervised WSD. While creating features from unlabeled data, we are “helping” a simple, basic knowledge-lean disambiguation algorithm to significantly increase its accuracy as a result of receiving easily obtainable knowledge. The performance of this method is compared to that of others that rely on completely different feature sets. Test results concerning nouns, adjectives and verbs show that web N-gram feature selection is a reliable alternative to previously existing approaches, provided that a “quality list” of features, adapted to the part of speech, is used.
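The setting described in the abstract, a Naïve Bayes model fitted without sense labels over a restricted feature list, can be illustrated with a minimal sketch. It assumes the standard EM procedure for a Naïve Bayes mixture and is not the authors' implementation: the function name `em_naive_bayes`, its parameters, and the toy contexts are all hypothetical, and the `vocab` argument merely stands in for a feature list (the paper derives its list from web N-grams, which is not reproduced here).

```python
import math
import random


def em_naive_bayes(contexts, vocab, n_senses, n_iter=30, alpha=0.1, seed=0):
    """Fit a Naive Bayes mixture with EM (hypothetical helper, illustration only).

    contexts: list of word lists surrounding the ambiguous target word
    vocab:    feature words to keep; all other context words are ignored
    Returns (priors, cond, post), where post[i][s] is P(sense s | context i).
    """
    rng = random.Random(seed)
    vocab = sorted(set(vocab))
    keep = set(vocab)
    # Feature selection: only words on the feature list are used.
    data = [[w for w in ctx if w in keep] for ctx in contexts]

    # Random soft assignment of each context to the senses.
    post = []
    for _ in data:
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        post.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: re-estimate smoothed sense priors and P(word | sense).
        priors = [
            (sum(p[s] for p in post) + alpha) / (len(data) + alpha * n_senses)
            for s in range(n_senses)
        ]
        cond = []
        for s in range(n_senses):
            counts = {w: alpha for w in vocab}
            for ctx, p in zip(data, post):
                for w in ctx:
                    counts[w] += p[s]
            total = sum(counts.values())
            cond.append({w: c / total for w, c in counts.items()})

        # E-step: recompute sense posteriors in log space for stability.
        post = []
        for ctx in data:
            logp = [
                math.log(priors[s]) + sum(math.log(cond[s][w]) for w in ctx)
                for s in range(n_senses)
            ]
            top = max(logp)
            weights = [math.exp(lp - top) for lp in logp]
            z = sum(weights)
            post.append([w / z for w in weights])

    return priors, cond, post


# Toy contexts for an ambiguous target such as "bank" (invented data).
contexts = [
    ["money", "loan", "the"],
    ["money", "loan", "the"],
    ["river", "water", "a"],
    ["river", "shore", "water"],
]
features = ["money", "loan", "river", "water", "shore"]
priors, cond, post = em_naive_bayes(contexts, features, n_senses=2)
```

Each row of `post` is a probability distribution over the induced senses. Because the senses are induced rather than labelled, their indices are arbitrary and have to be mapped to dictionary senses before accuracy can be measured.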
Pages: 241-260
Page count: 19