A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework

Authors
Flavio Massimiliano Cecchini
Martin Riedl
Elisabetta Fersini
Chris Biemann
Affiliations
[1] Università degli Studi di Milano-Bicocca, DISCo
[2] Universität Hamburg, Informatikum
Source
Language Resources and Evaluation | 2018 / Vol. 52
Keywords
Word sense induction; Graph clustering; Pseudowords; Evaluation
DOI
Not available
Abstract
This article compares different Word Sense Induction (WSI) clustering algorithms on two novel pseudoword data sets of semantic-similarity and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts to simulate a polysemous word. The evaluation compares each algorithm's output on a pseudoword's ego word graph (i.e., a graph that represents the pseudoword's context in the corpus) with the known subdivision given by the components corresponding to the monosemous source words that form the pseudoword. The main contribution of this article is a self-sufficient pseudoword-based evaluation framework for graph-based WSI clustering algorithms, which defines a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
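The pseudoword setup described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy ego graphs, the source words "catfish" and "violin", the use of connected components as a stand-in clustering algorithm, and the purity score. In particular, purity is not the article's top2 measure, and no hyperclustering step is shown, since the abstract does not define either.

```python
from collections import Counter

def nodes_of(ego):
    """All nodes mentioned in an adjacency dict (keys and neighbours)."""
    return set(ego) | {n for nbrs in ego.values() for n in nbrs}

def merge_ego_graphs(ego_a, ego_b):
    """Merge the ego graphs of two monosemous words into one pseudoword graph.

    Returns the undirected merged graph and a gold labelling that records
    which source word each node came from (nodes shared by both are unlabelled).
    """
    graph = {}
    for ego in (ego_a, ego_b):
        for node, nbrs in ego.items():
            graph.setdefault(node, set()).update(nbrs)
            for n in nbrs:
                graph.setdefault(n, set()).add(node)
    nodes_a, nodes_b = nodes_of(ego_a), nodes_of(ego_b)
    gold = {n: "a" for n in nodes_a - nodes_b}
    gold.update({n: "b" for n in nodes_b - nodes_a})
    return graph, gold

def connected_components(graph):
    """Stand-in clustering algorithm (hypothetical baseline): one cluster per component."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters

def purity(clusters, gold):
    """Share of labelled nodes that carry their cluster's majority gold label."""
    labelled = sum(1 for c in clusters for n in c if n in gold)
    majority = sum(
        max(Counter(gold[n] for n in c if n in gold).values(), default=0)
        for c in clusters
    )
    return majority / labelled if labelled else 0.0

# Toy contexts for two hypothetical monosemous source words, "catfish" and "violin".
ego_catfish = {"fish": {"river", "pond"}, "river": {"pond"}}
ego_violin = {"string": {"bow", "orchestra"}, "bow": {"orchestra"}}

graph, gold = merge_ego_graphs(ego_catfish, ego_violin)
clusters = connected_components(graph)
score = purity(clusters, gold)
```

Because the two toy contexts share no node, the stand-in clustering recovers the two source components exactly; real ego graphs overlap, which is where the choice of clustering algorithm starts to matter.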
Pages: 733-770
Page count: 37