A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework

Authors
Flavio Massimiliano Cecchini
Martin Riedl
Elisabetta Fersini
Chris Biemann
Affiliations
[1] Università degli Studi di Milano-Bicocca, DISCo
[2] Universität Hamburg, Informatikum
Source
Language Resources and Evaluation | 2018 / Vol. 52
Keywords
Word sense induction; Graph clustering; Pseudowords; Evaluation
Abstract
This article compares Word Sense Induction (WSI) clustering algorithms on two novel pseudoword data sets of semantic-similarity and co-occurrence-based word graphs, with a special focus on the detection of homonymic polysemy. We follow the original definition of a pseudoword as the combination of two monosemous terms and their contexts to simulate a polysemous word. The evaluation compares each algorithm's output on a pseudoword's ego word graph (i.e., a graph that represents the pseudoword's context in the corpus) with the known subdivision given by the components corresponding to the monosemous source words that form the pseudoword. The main contribution of this article is a self-sufficient pseudoword-based evaluation framework for graph-based WSI clustering algorithms, which introduces a new evaluation measure (top2) and a secondary clustering process (hyperclustering). To our knowledge, we are the first to conduct and discuss a large-scale systematic pseudoword evaluation targeting the induction of coarse-grained homonymous word senses across a large number of graph clustering algorithms.
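The pseudoword setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy words, the helper names (`build_pseudoword_ego_graph`, `connected_components`, `purity`), the use of connected components as a stand-in clustering algorithm, and the purity score are all assumptions made for illustration. In particular, the purity score is not the top2 measure, which the abstract names but does not define.

```python
from collections import defaultdict


def build_pseudoword_ego_graph(neighbors_a, neighbors_b, edges):
    """Merge the contexts of two monosemous source words into one ego graph.

    Returns (adjacency, gold), where gold maps each context word to the
    source word it came from. The pseudoword node itself is left out.
    Hypothetical helper, not taken from the article.
    """
    nodes = set(neighbors_a) | set(neighbors_b)
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)
    gold = {}
    for node in nodes:
        in_a, in_b = node in neighbors_a, node in neighbors_b
        gold[node] = "both" if in_a and in_b else ("a" if in_a else "b")
    return adjacency, gold


def connected_components(adjacency):
    """Stand-in clustering: each connected component is one induced sense."""
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


def purity(clusters, gold):
    """Fraction of nodes that carry the majority gold label of their cluster."""
    total = sum(len(cluster) for cluster in clusters)
    if total == 0:
        return 0.0
    correct = 0
    for cluster in clusters:
        counts = defaultdict(int)
        for node in cluster:
            counts[gold[node]] += 1
        correct += max(counts.values())
    return correct / total


# Toy data (assumed): contexts of two unrelated monosemous words.
neighbors_a = {"fruit", "peel", "yellow"}
neighbors_b = {"nail", "tool", "wood"}
edges = [("fruit", "peel"), ("peel", "yellow"), ("nail", "tool"), ("tool", "wood")]

adjacency, gold = build_pseudoword_ego_graph(neighbors_a, neighbors_b, edges)
clusters = connected_components(adjacency)
score = purity(clusters, gold)
```

In this toy case the two source contexts share no edges, so the stand-in clustering recovers them exactly. Real ego graphs are noisier, which is why the choice of clustering algorithm and evaluation measure matters.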
Pages: 733-770
Number of pages: 37