Unsupervised word sense disambiguation with N-gram features

Authors
Daniel Preotiuc-Pietro
Florentina Hristea
Affiliations
[1] University of Sheffield,Department of Computer Science
[2] University of Bucharest,Department of Computer Science
Keywords
Bayesian classification; The EM algorithm; Word sense disambiguation; Unsupervised disambiguation; Web-scale N-grams;
DOI
Not available
Abstract
The present paper concentrates on the issue of feature selection for unsupervised word sense disambiguation (WSD) performed with an underlying Naïve Bayes model. It introduces web N-gram features which, to our knowledge, are used for the first time in unsupervised WSD. While creating features from unlabeled data, we are “helping” a simple, basic knowledge-lean disambiguation algorithm to significantly increase its accuracy as a result of receiving easily obtainable knowledge. The performance of this method is compared to that of others that rely on completely different feature sets. Test results concerning nouns, adjectives and verbs show that web N-gram feature selection is a reliable alternative to previously existing approaches, provided that a “quality list” of features, adapted to the part of speech, is used.
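The abstract names a Naïve Bayes model and the keywords name the EM algorithm, but the record gives no further detail. The following is only a minimal sketch of how such a combination could look, assuming a multinomial Naïve Bayes mixture with add-one smoothing. The function `em_naive_bayes`, its parameters, and the `features` list (a stand-in for a feature list derived from web N-grams) are hypothetical and not taken from the paper.

```python
import math
import random


def em_naive_bayes(contexts, features, n_senses=2, n_iter=30, seed=0):
    """Soft-cluster contexts of an ambiguous word into n_senses groups.

    Assumption: a multinomial Naive Bayes mixture fitted with EM.
    `features` is a hypothetical pre-selected feature list; tokens
    outside it are ignored (the feature selection step).
    """
    rng = random.Random(seed)
    feat_set = set(features)
    feats = sorted(feat_set)
    docs = [[w for w in ctx if w in feat_set] for ctx in contexts]

    # Random soft initial assignment of each context to the senses.
    resp = []
    for _ in docs:
        row = [rng.random() + 1e-3 for _ in range(n_senses)]
        total = sum(row)
        resp.append([r / total for r in row])

    prior, cond = [], []
    for _ in range(n_iter):
        # M-step: re-estimate P(sense) and P(feature | sense), both smoothed.
        prior = [
            (sum(r[s] for r in resp) + 1.0) / (len(docs) + n_senses)
            for s in range(n_senses)
        ]
        cond = []
        for s in range(n_senses):
            counts = {w: 1.0 for w in feats}  # add-one smoothing
            for doc, r in zip(docs, resp):
                for w in doc:
                    counts[w] += r[s]
            total = sum(counts.values())
            cond.append({w: c / total for w, c in counts.items()})

        # E-step: posterior P(sense | context), computed in log space.
        for i, doc in enumerate(docs):
            logp = [
                math.log(prior[s]) + sum(math.log(cond[s][w]) for w in doc)
                for s in range(n_senses)
            ]
            top = max(logp)
            expd = [math.exp(lp - top) for lp in logp]
            z = sum(expd)
            resp[i] = [e / z for e in expd]
    return resp, prior, cond


# Toy usage with invented contexts of the word "bank".
toy_contexts = [
    ["money", "loan", "the"],
    ["river", "water", "a"],
    ["loan", "money"],
    ["water", "river"],
]
toy_features = ["money", "loan", "river", "water"]
toy_resp, toy_prior, toy_cond = em_naive_bayes(toy_contexts, toy_features)
```

Each row of `toy_resp` holds the posterior sense probabilities for one context. The induced senses are unlabeled clusters, so mapping them to dictionary senses would need a separate step that this sketch does not cover.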
Pages: 241-260
Page count: 19