Word sense induction with agglomerative clustering and mutual information maximization

Cited by: 0
Authors
Abdine, Hadi [1 ]
Eddine, Moussa Kamal [1 ]
Buscaldi, Davide [2 ]
Vazirgiannis, Michalis [1 ]
Affiliations
[1] Ecole Polytech, LIX, Palaiseau, France
[2] Univ Sorbonne Paris Nord, LIPN, Paris, France
Source
AI OPEN, 2023, Vol. 4
Keywords
Word sense induction; Unsupervised machine learning; Natural language processing; Transformer; BERT; Mutual information; Clustering;
DOI
10.1016/j.aiopen.2023.12.001
CLC classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word's senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public.¹
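The abstract names two computational ingredients: an IIC-style objective that maximizes the mutual information between two representations of the same target word (one per paraphrase), and agglomerative (hierarchical) clustering of the resulting vectors. The sketch below is a minimal illustration of those two ingredients, not the authors' released implementation: the ProjectionHead architecture, its layer sizes, the random toy embeddings, the fixed number of clusters, and the choice of the hidden layer as the clustering representation are all illustrative assumptions; only the generic IIC loss (Ji et al., 2019) and scikit-learn's AgglomerativeClustering are standard components.

```python
import torch
import torch.nn as nn
from sklearn.cluster import AgglomerativeClustering


class ProjectionHead(nn.Module):
    """Small network mapping a contextual word vector to a soft cluster assignment.
    (Hypothetical architecture used only for this sketch.)"""
    def __init__(self, dim_in: int = 768, n_senses: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 256),
            nn.ReLU(),
            nn.Linear(256, n_senses),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.net(x), dim=-1)


def iic_loss(p: torch.Tensor, p_pair: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Invariant information clustering objective: negative mutual information
    between the soft assignments of the two paired views."""
    joint = p.T @ p_pair / p.size(0)        # empirical joint distribution, shape (k, k)
    joint = (joint + joint.T) / 2           # symmetrize: the pair order is arbitrary
    joint = joint.clamp(min=eps)
    p_row = joint.sum(dim=1, keepdim=True)  # marginal over the first view
    p_col = joint.sum(dim=0, keepdim=True)  # marginal over the second view
    return -(joint * (joint.log() - p_row.log() - p_col.log())).sum()


# Toy training loop: `emb` and `emb_pair` stand in for contextual (e.g., BERT) vectors
# of the same target word in a sentence and in a synthetic paraphrase of that sentence.
head = ProjectionHead()
optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
emb, emb_pair = torch.randn(32, 768), torch.randn(32, 768)
for _ in range(100):
    loss = iic_loss(head(emb), head(emb_pair))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inference: take an intermediate representation from the trained head and group the
# occurrences of the target word with agglomerative (hierarchical) clustering.
with torch.no_grad():
    feats = head.net[:2](emb).numpy()       # hidden-layer output, shape (32, 256)
sense_labels = AgglomerativeClustering(n_clusters=5, linkage="average").fit_predict(feats)
print(sense_labels)
```

For the dynamic-cluster configuration mentioned in the abstract, a variable number of clusters can be obtained, for instance, by passing n_clusters=None together with a distance_threshold to scikit-learn's AgglomerativeClustering; whether this matches the paper's exact configuration is not specified in this record.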
Pages: 193-201
Number of pages: 9
Related Papers
50 records in total (first 10 shown)
  • [1] Agglomerative hierarchical clustering of continuous variables based on mutual information
    Kojadinovic, I
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2004, 46 (02) : 269 - 294
  • [2] A hierarchical clustering based on mutual information maximization
    Aghagolzadeh, M.
    Soltanian-Zadeh, H.
    Araabi, B.
    Aghagolzadeh, A.
    2007 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-7, 2007: 277+
  • [3] Deep node clustering based on mutual information maximization
    Molaei, Soheila
    Bousejin, Nima Ghanbari
    Zare, Hadi
    Jalili, Mahdi
    NEUROCOMPUTING, 2021, 455 : 274 - 282
  • [4] Learning Deep Generative Clustering via Mutual Information Maximization
    Yang, Xiaojiang
    Yan, Junchi
    Cheng, Yu
    Zhang, Yizhe
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (09) : 6263 - 6275
  • [5] Variational Deep Embedding Clustering by Augmented Mutual Information Maximization
    Ji, Qiang
    Sun, Yanfeng
    Hu, Yongli
    Yin, Baocai
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2196 - 2202
  • [6] Information-Maximization Clustering Based on Squared-Loss Mutual Information
    Sugiyama, Masashi
    Niu, Gang
    Yamada, Makoto
    Kimura, Manabu
    Hachiya, Hirotaka
    NEURAL COMPUTATION, 2014, 26 (01) : 84 - 131
  • [7] Deep graph clustering via mutual information maximization and mixture model
    Ahmadi, Maedeh
    Safayani, Mehran
    Mirzaei, Abdolreza
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (08) : 4549 - 4572
  • [8] Dependence-Maximization Clustering with Least-Squares Mutual Information
    Kimura, Manabu
    Sugiyama, Masashi
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2011, 15 (07) : 800 - 805
  • [9] AGGLOMERATIVE CLUSTERING USING CONCEPT OF MUTUAL NEAREST NEIGHBORHOOD
    GOWDA, KC
    KRISHNA, G
    PATTERN RECOGNITION, 1978, 10 (02) : 105 - 112
  • [10] Word sense disambiguation based on word sense clustering
    Anaya-Sanchez, Henry
    Pons-Porrata, Aurora
    Berlanga-Llavori, Rafael
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA-SBIA 2006, PROCEEDINGS, 2006, 4140 : 472 - 481