Word sense induction with agglomerative clustering and mutual information maximization

Cited by: 0
Authors
Abdine, Hadi [1 ]
Eddine, Moussa Kamal [1 ]
Buscaldi, Davide [2 ]
Vazirgiannis, Michalis [1 ]
Affiliations
[1] Ecole Polytech, LIX, Palaiseau, France
[2] Univ Sorbonne Paris Nord, LIPN, Paris, France
Source
AI OPEN, 2023, Vol. 4
Keywords
Word sense induction; Unsupervised machine learning; Natural language processing; Transformer; BERT; Mutual information; Clustering;
DOI
10.1016/j.aiopen.2023.12.001
CLC classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word's senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public.¹
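The abstract names two computational ingredients: an IIC-style objective that maximizes the mutual information between two representations of the same target word (one per paraphrase), and agglomerative (hierarchical) clustering of the resulting vectors. The sketch below is a minimal illustration of those two ingredients, not the authors' released implementation: the ProjectionHead architecture, its layer sizes, the random toy embeddings, the fixed number of clusters, and the choice of the hidden layer as the clustering representation are all illustrative assumptions; only the generic IIC loss (Ji et al., 2019) and scikit-learn's AgglomerativeClustering are standard components.

```python
import torch
import torch.nn as nn
from sklearn.cluster import AgglomerativeClustering


class ProjectionHead(nn.Module):
    """Small network mapping a contextual word vector to a soft cluster assignment.
    (Hypothetical architecture used only for this sketch.)"""
    def __init__(self, dim_in: int = 768, n_senses: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 256),
            nn.ReLU(),
            nn.Linear(256, n_senses),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.net(x), dim=-1)


def iic_loss(p: torch.Tensor, p_pair: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Invariant information clustering objective: negative mutual information
    between the soft assignments of the two paired views."""
    joint = p.T @ p_pair / p.size(0)        # empirical joint distribution, shape (k, k)
    joint = (joint + joint.T) / 2           # symmetrize: the pair order is arbitrary
    joint = joint.clamp(min=eps)
    p_row = joint.sum(dim=1, keepdim=True)  # marginal over the first view
    p_col = joint.sum(dim=0, keepdim=True)  # marginal over the second view
    return -(joint * (joint.log() - p_row.log() - p_col.log())).sum()


# Toy training loop: `emb` and `emb_pair` stand in for contextual (e.g., BERT) vectors
# of the same target word in a sentence and in a synthetic paraphrase of that sentence.
head = ProjectionHead()
optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
emb, emb_pair = torch.randn(32, 768), torch.randn(32, 768)
for _ in range(100):
    loss = iic_loss(head(emb), head(emb_pair))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inference: take an intermediate representation from the trained head and group the
# occurrences of the target word with agglomerative (hierarchical) clustering.
with torch.no_grad():
    feats = head.net[:2](emb).numpy()       # hidden-layer output, shape (32, 256)
sense_labels = AgglomerativeClustering(n_clusters=5, linkage="average").fit_predict(feats)
print(sense_labels)
```

For the dynamic-cluster configuration mentioned in the abstract, a variable number of clusters can be obtained, for instance, by passing n_clusters=None together with a distance_threshold to scikit-learn's AgglomerativeClustering; whether this matches the paper's exact configuration is not specified in this record.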
Pages: 193-201
Number of pages: 9
Related Papers
50 records in total (first 10 shown)
  • [1] Agglomerative hierarchical clustering of continuous variables based on mutual information
    Kojadinovic, I
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2004, 46 (02) : 269 - 294
  • [2] A hierarchical clustering based on mutual information maximization
    Aghagolzadeh, M.
    Soltanian-Zadeh, H.
    Araabi, B.
    Aghagolzadeh, A.
    2007 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-7, 2007: 277+
  • [3] Deep node clustering based on mutual information maximization
    Molaei, Soheila
    Bousejin, Nima Ghanbari
    Zare, Hadi
    Jalili, Mahdi
    NEUROCOMPUTING, 2021, 455 : 274 - 282
  • [4] Learning Deep Generative Clustering via Mutual Information Maximization
    Yang, Xiaojiang
    Yan, Junchi
    Cheng, Yu
    Zhang, Yizhe
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (09) : 6263 - 6275
  • [5] Variational Deep Embedding Clustering by Augmented Mutual Information Maximization
    Ji, Qiang
    Sun, Yanfeng
    Hu, Yongli
    Yin, Baocai
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 2196 - 2202
  • [6] Information-Maximization Clustering Based on Squared-Loss Mutual Information
    Sugiyama, Masashi
    Niu, Gang
    Yamada, Makoto
    Kimura, Manabu
    Hachiya, Hirotaka
    NEURAL COMPUTATION, 2014, 26 (01) : 84 - 131
  • [7] Deep graph clustering via mutual information maximization and mixture model
    Ahmadi, Maedeh
    Safayani, Mehran
    Mirzaei, Abdolreza
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (08) : 4549 - 4572
  • [8] Dependence-Maximization Clustering with Least-Squares Mutual Information
    Kimura, Manabu
    Sugiyama, Masashi
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2011, 15 (07) : 800 - 805
  • [9] AGGLOMERATIVE CLUSTERING USING CONCEPT OF MUTUAL NEAREST NEIGHBORHOOD
    GOWDA, KC
    KRISHNA, G
    PATTERN RECOGNITION, 1978, 10 (02) : 105 - 112
  • [10] Word sense disambiguation based on word sense clustering
    Anaya-Sanchez, Henry
    Pons-Porrata, Aurora
    Berlanga-Llavori, Rafael
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA-SBIA 2006, PROCEEDINGS, 2006, 4140 : 472 - 481