Clustering-Aware Negative Sampling for Unsupervised Sentence Representation

Cited by: 0
Authors
Deng, Jinghao [1]
Wan, Fanqi [1]
Yang, Tao [1]
Quan, Xiaojun [1]
Wang, Rui [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Vipshop China Co Ltd, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
NEURAL-NETWORKS;
DOI
Not available
Abstract
Contrastive learning has been widely studied in sentence representation learning. However, earlier works mainly focus on the construction of positive examples, while in-batch samples are often simply treated as negative examples. This approach overlooks the importance of selecting appropriate negative examples, potentially leading to a scarcity of hard negatives and the inclusion of false negatives. To address these issues, we propose ClusterNS (Clustering-aware Negative Sampling), a novel method that incorporates cluster information into contrastive learning for unsupervised sentence representation learning. We apply a modified K-means clustering algorithm to supply hard negatives and recognize in-batch false negatives during training, aiming to solve the two issues in one unified framework. Experiments on semantic textual similarity (STS) tasks demonstrate that our proposed ClusterNS compares favorably with baselines in unsupervised sentence representation learning. Our code has been made publicly available.
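The idea summarized in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: it uses a plain spherical K-means rather than the paper's modified variant, and it assumes two specific rules that the abstract does not state, namely that the hard negative for a sentence is the nearest centroid of a different cluster, and that in-batch samples sharing the anchor's cluster are treated as false negatives and masked out of the loss. The function names `kmeans_cosine` and `cluster_aware_info_nce` and the `temperature` default are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def kmeans_cosine(x, k, iters=10, seed=0):
    """Plain spherical K-means (an assumption; the paper uses a modified variant)."""
    rng = np.random.default_rng(seed)
    x = l2_normalize(x)
    # Initialise centroids from k distinct samples (fancy indexing returns a copy).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(x @ centroids.T, axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)
        centroids = l2_normalize(centroids)
    labels = np.argmax(x @ centroids.T, axis=1)
    return centroids, labels


def cluster_aware_info_nce(anchors, positives, centroids, labels, temperature=0.05):
    """InfoNCE loss with cluster-based hard negatives and false-negative masking.

    Assumes k >= 2 and that labels[i] indexes the centroid assigned to anchors[i].
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)
    n = len(a)

    # In-batch similarities; the diagonal holds the positive pairs.
    sim = a @ p.T / temperature

    # Assumed rule: other in-batch samples in the anchor's cluster are false negatives.
    same_cluster = labels[:, None] == labels[None, :]
    false_neg = same_cluster & ~np.eye(n, dtype=bool)
    sim = np.where(false_neg, -np.inf, sim)

    # Assumed rule: the hard negative is the nearest centroid of a different cluster.
    cent_sim = a @ centroids.T
    cent_sim[np.arange(n), labels] = -np.inf
    hard = centroids[np.argmax(cent_sim, axis=1)]
    hard_sim = np.sum(a * hard, axis=1, keepdims=True) / temperature

    # Softmax cross-entropy over [in-batch candidates, hard negative].
    logits = np.concatenate([sim, hard_sim], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss = -log_prob[np.arange(n), np.arange(n)].mean()
    return loss, false_neg
```

In a real training loop the embeddings would come from a sentence encoder and the centroids would be updated as training proceeds; here both are stand-ins so that the masking and hard-negative steps can be read in isolation.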
Pages: 8713-8729
Page count: 17
Related papers
50 in total
  • [21] Online Deep Clustering for Unsupervised Representation Learning
    Zhan, Xiaohang
    Xie, Jiahao
    Liu, Ziwei
    Ong, Yew-Soon
    Loy, Chen Change
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 6687 - 6696
  • [22] Iterative Autoencoding and Clustering for Unsupervised Feature Representation
    Du, Songlin
    Ikenaga, Takeshi
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [23] Jigsaw Clustering for Unsupervised Visual Representation Learning
    Chen, Pengguang
    Liu, Shu
    Jia, Jiaya
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 11521 - 11530
  • [24] Sentence Clustering Using Continuous Vector Space Representation
    Chinea-Rios, Mara
    Sanchis-Trilles, German
    Casacuberta, Francisco
    PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2015), 2015, 9117 : 432 - 440
  • [25] Sentence level matrix representation for document spectral clustering
    Mijangos, Victor
    Sierra, Gerardo
    Montes, Azucena
    PATTERN RECOGNITION LETTERS, 2017, 85 : 29 - 34
  • [26] Defect Clustering-Aware Spare-TSV Allocation in 3-D ICs for Yield Enhancement
    Wan, Shengcheng
    Chakrabarty, Krishnendu
    Tahoori, Mehdi B.
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 38 (10) : 1928 - 1941
  • [27] Unsupervised Sentence Representation Learning with Frequency-induced Adversarial tuning and Incomplete sentence filtering
    Wang, Bing
    Li, Ximing
    Yang, Zhiyao
    Guan, Yuanyuan
    Li, Jiayin
    Wang, Shengsheng
    NEURAL NETWORKS, 2024, 175
  • [28] Contrastive Learning for Unsupervised Sentence Embedding with False Negative Calibration
    Chiu, Chi-Min
    Lin, Ying-Jia
    Kao, Hung-Yu
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT III, PAKDD 2024, 2024, 14647 : 290 - 301
  • [29] Augmentation blending with clustering-aware outlier factor: An outlier-driven perspective for enhanced contrastive learning
    Meng, Qianwen
    Qian, Hangwei
    Xu, Yonghui
    Cui, Lizhen
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [30] False Negative Sample Aware Negative Sampling for Recommendation
    Chen, Liguo
    Gong, Zhigang
    Xie, Hong
    Zhou, Mingqiang
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT V, PAKDD 2024, 2024, 14649 : 195 - 206