Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction

Authors
Agre, Gennady [1]
Petrov, Daniel [2]
Keskinova, Simona [2]
Affiliations
[1] Bulgarian Acad Sci, Inst Informat & Commun Technol, Sofia 1113, Bulgaria
[2] Tech Univ Sofia, Lab Comp Graph & Geog Informat Syst, Sofia 2173, Bulgaria
Keywords
word sense disambiguation; word embedding; classification; neural networks; random forest; deep forest; JRip; KNOWLEDGE;
DOI
10.3390/info10030097
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The paper presents a flexible system for extracting features and creating training and test examples for the all-words word sense disambiguation (WSD) task. The system allows word and sense embeddings to be integrated into an example description. Two features distinguish it from similar WSD systems: the ability to construct a special compressed representation for word embeddings, and the ability to construct training and test sets of examples at different data granularities. The first feature allows generation of data sets with quite small dimensionality, which can be used for training highly accurate classifiers of different types. The second allows generating sets of examples for training classifiers specialized in disambiguating a single specific word, words belonging to the same part-of-speech (POS) category, or all open-class words. Extensive experimentation has shown that classifiers trained on examples created by the system outperform the standard baselines used to evaluate all-words WSD classifiers.
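The three data granularities named in the abstract (a single word, a POS category, all open-class words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Example` record, the `partition` function, the POS tag names, and the sense labels are all hypothetical, chosen only to show how one pool of labelled examples might be split into training sets for differently specialized classifiers.

```python
from collections import defaultdict
from typing import NamedTuple


class Example(NamedTuple):
    """Hypothetical labelled example; field names are assumptions."""
    lemma: str
    pos: str
    features: tuple
    sense: str


# Assumed tag set for open-class words (not taken from the paper).
OPEN_CLASS = {"NOUN", "VERB", "ADJ", "ADV"}


def group_key(example, granularity):
    """Return the key that decides which classifier an example belongs to."""
    if granularity == "word":
        return (example.lemma, example.pos)
    if granularity == "pos":
        return example.pos
    if granularity == "all":
        return "all"
    raise ValueError(f"unknown granularity: {granularity!r}")


def partition(examples, granularity):
    """Split open-class examples into one data set per classifier."""
    groups = defaultdict(list)
    for ex in examples:
        if ex.pos not in OPEN_CLASS:
            continue  # closed-class words are not disambiguated here
        groups[group_key(ex, granularity)].append(ex)
    return dict(groups)


# Made-up toy data; the feature values and sense labels carry no meaning.
EXAMPLES = [
    Example("bank", "NOUN", (0.1, 0.2), "bank#n#1"),
    Example("bank", "VERB", (0.3, 0.1), "bank#v#1"),
    Example("run", "VERB", (0.5, 0.4), "run#v#2"),
    Example("the", "DET", (0.0, 0.0), "-"),
]

by_word = partition(EXAMPLES, "word")
by_pos = partition(EXAMPLES, "pos")
by_all = partition(EXAMPLES, "all")
```

Under these assumptions, `by_word` yields one data set per (lemma, POS) pair, `by_pos` one per POS category, and `by_all` a single data set covering every open-class word.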
Pages: 14