Unsupervised word sense disambiguation and rules extraction using non-aligned bilingual corpus

Cited by: 0
Authors
Oliveira, F [1]
Wong, F [1]
Li, YP [1]
Zheng, J [1]
Affiliations
[1] Univ Macau, Fac Sci & Technol, Macao, Peoples R China
Keywords
word sense disambiguation; natural language processing; machine translation;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a statistical Word Sense Disambiguation method with application to Portuguese-Chinese machine translation systems. Because Portuguese-Chinese resources such as digital corpora and annotated treebanks are scarce, the method applies unsupervised learning to a non-aligned bilingual corpus. It first identifies the words related to each ambiguous word, based on the surrounding words and their relative distance. A mathematical model then selects the most suitable sense of an ambiguous word in terms of those related words. All discovered senses are converted into a set of rules and stored in the Sense Knowledge Base for later use in disambiguation and translation. Preliminary experimental results show a 6% improvement over the baseline method in assigning the correct translation.
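The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea it describes: collect the words around an ambiguous word, weight them by relative distance, and pick the sense whose stored related words best match a new context. The function names, the 1/distance weighting, the window size, and the toy rule table for Portuguese "banco" are all assumptions made for illustration, not the authors' model.

```python
from collections import defaultdict


def related_words(sentences, target, window=3):
    """Score words near `target`, weighting each by 1/distance (assumed weighting)."""
    scores = defaultdict(float)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    scores[tokens[j]] += 1.0 / abs(i - j)
    return dict(scores)


def pick_sense(context, target_index, sense_rules, window=3):
    """Return the sense whose related words best match the context, or None."""
    best_sense, best_score = None, 0.0
    for sense, related in sense_rules.items():
        score = 0.0
        for j, tok in enumerate(context):
            d = abs(j - target_index)
            if 0 < d <= window and tok in related:
                score += related[tok] / d
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical rule table: sense label -> {related word: weight}.
rules = {
    "bank": {"dinheiro": 1.0, "conta": 0.8},
    "bench": {"jardim": 1.0, "sentar": 0.9},
}
sense = pick_sense(["abrir", "conta", "no", "banco"], 3, rules)
```

Here `sense` is `"bank"`, because "conta" lies within the window of "banco" and appears only in that sense's related words. In a real system the rule table would be learned from the corpus rather than written by hand.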
Pages: 30-35
Page count: 6
Related papers
50 in total
  • [31] Semi-supervised Word Sense Disambiguation Using the Web as Corpus
    Guzman-Cabrera, Rafael
    Rosso, Paolo
    Montes-y-Gomez, Manuel
    Villasenor-Pineda, Luis
    Pinto-Avendano, David
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2009, 5449 : 256 - +
  • [32] Word sense disambiguation of Thai language with unsupervised learning
    Pongpinigpinyo, S
    Rivepiboon, W
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS, 2005, 3681 : 1275 - 1283
  • [33] Graph Connectivity Measures for Unsupervised Word Sense Disambiguation
    Navigli, Roberto
    Lapata, Mirella
    20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 1683 - 1688
  • [34] The Noisy Channel Model for Unsupervised Word Sense Disambiguation
    Yuret, Deniz
    Yatbaz, Mehmet Ali
    COMPUTATIONAL LINGUISTICS, 2010, 36 (01) : 111 - 127
  • [35] Topic Modeling and Word Sense Disambiguation on the Ancora corpus
    Izquierdo, Ruben
    Postma, Marten
    Vossen, Piek
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2015, (55): : 15 - 22
  • [36] Sense Unveiled: Enhancing Urdu Corpus for Nuanced Word Sense Disambiguation
    Bibi, Sarfraz
    Asghar, Sohail
    Zubair, Muhammad
    IEEE ACCESS, 2024, 12 : 126329 - 126343
  • [37] Unsupervised Word Sense Disambiguation Using Markov Random Field and Dependency Parser
    Chaplot, Devendra Singh
    Bhattacharyya, Pushpak
    Paranjape, Ashwin
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 2217 - 2223
  • [38] Word Sense Disambiguation by Information Filtering and Extraction
    Jeremy Ellman
    Ian Klincke
    John Tait
    Computers and the Humanities, 2000, 34 : 127 - 134
  • [39] Word Sense Disambiguation Features for Taxonomy Extraction
    Alexeyevsky, Daniil
    COMPUTACION Y SISTEMAS, 2018, 22 (03): : 871 - 880
  • [40] Unsupervised graph-based word sense disambiguation using measures of word semantic similarity
    Sinha, Ravi
    Mihalcea, Rada
    ICSC 2007: INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING, PROCEEDINGS, 2007, : 363 - +