Clustering-Aware Negative Sampling for Unsupervised Sentence Representation

Cited by: 0
Authors
Deng, Jinghao [1]
Wan, Fanqi [1]
Yang, Tao [1]
Quan, Xiaojun [1]
Wang, Rui [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Vipshop China Co Ltd, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
NEURAL-NETWORKS;
DOI
Not available
Abstract
Contrastive learning has been widely studied in sentence representation learning. However, earlier works mainly focus on the construction of positive examples, while in-batch samples are often simply treated as negative examples. This approach overlooks the importance of selecting appropriate negative examples, potentially leading to a scarcity of hard negatives and the inclusion of false negatives. To address these issues, we propose ClusterNS (Clustering-aware Negative Sampling), a novel method that incorporates cluster information into contrastive learning for unsupervised sentence representation learning. We apply a modified K-means clustering algorithm to supply hard negatives and recognize in-batch false negatives during training, aiming to solve the two issues in one unified framework. Experiments on semantic textual similarity (STS) tasks demonstrate that our proposed ClusterNS compares favorably with baselines in unsupervised sentence representation learning. Our code has been made publicly available.
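The general idea in the abstract, using cluster assignments both to supply a hard negative and to flag in-batch false negatives, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes plain cosine K-means rather than the modified variant the abstract mentions, takes the nearest other-cluster centroid as the hard negative, and simply masks same-cluster in-batch samples out of an InfoNCE loss. All function names and parameters (`cosine_kmeans`, `cluster_aware_info_nce`, `temperature`) are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def cosine_kmeans(x, k, iters=10, seed=0):
    # Plain K-means on unit vectors (assumption: NOT the paper's modified variant).
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(x @ centroids.T, axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
        centroids = l2_normalize(centroids)
    labels = np.argmax(x @ centroids.T, axis=1)
    return labels, centroids

def cluster_aware_info_nce(anchors, positives, labels, centroids, temperature=0.05):
    n = len(anchors)
    sim = anchors @ positives.T / temperature  # (n, n) in-batch similarities
    # Treat off-diagonal samples in the anchor's own cluster as false negatives
    # and mask them out (a simplification chosen for this sketch).
    same_cluster = labels[:, None] == labels[None, :]
    false_neg = same_cluster & ~np.eye(n, dtype=bool)
    sim = np.where(false_neg, -np.inf, sim)
    # Hard negative: the most similar centroid belonging to a different cluster.
    cen_sim = anchors @ centroids.T / temperature  # (n, k)
    cen_sim[np.arange(n), labels] = -np.inf
    hard = cen_sim.max(axis=1, keepdims=True)  # (n, 1)
    logits = np.concatenate([sim, hard], axis=1)
    # Numerically stable log-softmax; the positive sits on the diagonal.
    m = logits.max(axis=1, keepdims=True)
    log_prob = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    loss = -log_prob[np.arange(n), np.arange(n)].mean()
    return loss, false_neg

# Toy data standing in for sentence embeddings of two views of each sentence.
rng = np.random.default_rng(0)
anchors = l2_normalize(rng.normal(size=(16, 8)))
positives = l2_normalize(anchors + 0.01 * rng.normal(size=(16, 8)))
labels, centroids = cosine_kmeans(anchors, k=4)
loss, false_neg = cluster_aware_info_nce(anchors, positives, labels, centroids)
print(float(loss), int(false_neg.sum()))
```

In a real training loop the embeddings would come from the sentence encoder and the clustering would be refreshed as the encoder changes; how the paper handles either step is not stated in this record.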
Pages: 8713 - 8729
Page count: 17
Related papers
  • [41] Role of Context in Unsupervised Sentence Representation Learning: the Case of Dialog Act Tagging
    Hronsky, Rastislav
    Keuleers, Emmanuel
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 8784 - 8792
  • [42] Unsupervised Cross-Lingual Sentence Representation Learning via Linguistic Isomorphism
    Wang, Shuai
    Hou, Lei
    Tong, Meihan
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2019, PT II, 2019, 11776 : 215 - 226
  • [43] Unsupervised Geometric and Topological Approaches for Cross-Lingual Sentence Representation and Comparison
    Meirom, Shaked Haim
    Bobrowski, Omer
    PROCEEDINGS OF THE 7TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP, 2022, : 173 - 183
  • [45] An effective negative sampling approach for contrastive learning of sentence embedding
    Tan, Qitao
    Song, Xiaoying
    Ye, Guanghui
    Wu, Chuan
    MACHINE LEARNING, 2023, 112 (12) : 4837 - 4861
  • [46] Contrastive sentence representation learning with adaptive false negative cancellation
    Xu, Lingling
    Xie, Haoran
    Wang, Fu Lee
    Tao, Xiaohui
    Wang, Weiming
    Li, Qing
    INFORMATION FUSION, 2024, 102
  • [47] Research Progress of Deep Clustering Based on Unsupervised Representation Learning
    Hou, Haiwei
    Ding, Shifei
    Xu, Xiao
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2022, 35 (11) : 999 - 1014
  • [48] Unsupervised Learning of Deep Feature Representation for Clustering Egocentric Actions
    Bhatnagar, Bharat Lal
    Singh, Suriya
    Arora, Chetan
    Jawahar, C. V.
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1447 - 1453
  • [49] Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering
    Mei, Guofeng
    Saltori, Cristiano
    Ricci, Elisa
    Sebe, Nicu
    Wu, Qiang
    Zhang, Jian
    Poiesi, Fabio
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (08) : 3251 - 3269
  • [50] Deep Adaptive Fuzzy Clustering for Evolutionary Unsupervised Representation Learning
    Tan, Dayu
    Huang, Zheng
    Peng, Xin
    Zhong, Weimin
    Mahalec, Vladimir
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 6103 - 6117