Clustering-Aware Negative Sampling for Unsupervised Sentence Representation

Cited by: 0
Authors
Deng, Jinghao [1]
Wan, Fanqi [1]
Yang, Tao [1]
Quan, Xiaojun [1]
Wang, Rui [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Vipshop China Co Ltd, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
NEURAL-NETWORKS;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Contrastive learning has been widely studied in sentence representation learning. However, earlier works mainly focus on the construction of positive examples, while in-batch samples are often simply treated as negative examples. This approach overlooks the importance of selecting appropriate negative examples, potentially leading to a scarcity of hard negatives and the inclusion of false negatives. To address these issues, we propose ClusterNS (Clustering-aware Negative Sampling), a novel method that incorporates cluster information into contrastive learning for unsupervised sentence representation learning. We apply a modified K-means clustering algorithm to supply hard negatives and recognize in-batch false negatives during training, aiming to solve the two issues in one unified framework. Experiments on semantic textual similarity (STS) tasks demonstrate that our proposed ClusterNS compares favorably with baselines in unsupervised sentence representation learning. Our code has been made publicly available.
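The idea in the abstract, using cluster information both to supply hard negatives and to flag in-batch false negatives, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the centroids are assumed to be given (the paper's modified K-means update is not reproduced), the choice of "nearest other centroid" as the hard negative and "same cluster" as the false-negative test are illustrative assumptions, and the names cluster_aware_negatives, contrastive_loss, and temperature are hypothetical.

```python
import numpy as np


def normalize(x):
    """L2-normalize rows so that dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cluster_aware_negatives(embeddings, centroids):
    """Assign each sentence to a cluster, then derive negatives from it.

    Assumptions (not taken from the paper): the hard negative of a sentence
    is the most similar centroid other than its own, and any other in-batch
    sentence in the same cluster is treated as a false negative.
    """
    z = normalize(embeddings)
    c = normalize(centroids)
    sim = z @ c.T                          # (n, k) cosine similarities
    assign = sim.argmax(axis=1)            # nearest centroid per sentence

    masked = sim.copy()
    masked[np.arange(len(z)), assign] = -np.inf
    hard_neg = c[masked.argmax(axis=1)]    # nearest *other* centroid

    false_neg_mask = assign[:, None] == assign[None, :]
    np.fill_diagonal(false_neg_mask, False)  # a sentence is not its own negative
    return assign, hard_neg, false_neg_mask


def contrastive_loss(anchors, positives, hard_neg, false_neg_mask, temperature=0.05):
    """InfoNCE-style loss that skips false negatives and adds one hard negative."""
    a, p = normalize(anchors), normalize(positives)
    n = len(a)
    logits = a @ p.T / temperature                      # diagonal holds the positives
    logits = np.where(false_neg_mask, -np.inf, logits)  # drop suspected false negatives
    hard = np.sum(a * hard_neg, axis=1, keepdims=True) / temperature
    all_logits = np.concatenate([logits, hard], axis=1)

    # Numerically stable log-softmax over each row.
    m = all_logits.max(axis=1, keepdims=True)
    log_prob = all_logits - m - np.log(np.exp(all_logits - m).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), np.arange(n)].mean()


# Toy batch: two sentences near each of two centroids.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
assign, hard_neg, false_neg_mask = cluster_aware_negatives(embeddings, centroids)
loss = contrastive_loss(embeddings, embeddings, hard_neg, false_neg_mask)
```

In the toy batch, the first two sentences share a cluster, so each is masked out of the other's negatives, while the opposite centroid is appended as an extra hard negative.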
Pages: 8713-8729
Page count: 17
Related papers
50 in total
  • [1] Unsupervised Path Representation Learning with Curriculum Negative Sampling
    Bin Yang, Sean
    Guo, Chenjuan
    Hu, Jilin
    Tang, Jian
    Yang, Bin
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3286 - 3292
  • [2] Clustering-Aware Graph Construction: A Joint Learning Perspective
    Jia, Yuheng
    Liu, Hui
    Hou, Junhui
    Kwong, Sam
    IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2020, 6 : 357 - 370
  • [3] Clustering-Aware Structure-Constrained Low-Rank Submodule Clustering
    Wu, Tong
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1852 - 1856
  • [4] Attributed multiplex graph clustering: A heuristic clustering-aware network embedding approach
    Han, Beibei
    Wei, Yingmei
    Kang, Lai
    Wang, Qingyong
    Feng, Suru
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2022, 592
  • [5] AdaNS: Adaptive negative sampling for unsupervised graph representation learning
    Wang, Yu
    Hu, Liang
    Gao, Wanfu
    Cao, Xiaofeng
    Chang, Yi
    PATTERN RECOGNITION, 2023, 136
  • [6] CLUSTERING-AWARE STRUCTURE-CONSTRAINED LOW-RANK REPRESENTATION MODEL FOR LEARNING HUMAN ACTION ATTRIBUTES
    Wu, Tong
    Gurram, Prudhvi
    Rao, Raghuveer M.
    Bajwa, Waheed U.
    2016 IEEE 12TH IMAGE, VIDEO, AND MULTIDIMENSIONAL SIGNAL PROCESSING WORKSHOP (IVMSP), 2016,
  • [7] Deep boundary-aware clustering by jointly optimizing unsupervised representation learning
    Ru Wang
    Lin Li
    Peipei Wang
    Xiaohui Tao
    Peiyu Liu
    Multimedia Tools and Applications, 2022, 81 : 34309 - 34324
  • [8] Unsupervised Topic Aware Document-Level Semantic Representation for Document Clustering
    Rafi, Muhammad
    Khan, Hamza
    Nadeem, Haya
    Shakeel, Hassan
    2021 22ND INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2021, : 170 - 179
  • [9] Deep boundary-aware clustering by jointly optimizing unsupervised representation learning
    Wang, Ru
    Li, Lin
    Wang, Peipei
    Tao, Xiaohui
    Liu, Peiyu
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (24) : 34309 - 34324
  • [10] PLANET MINERAL DISTRIBUTION DETECTION VIA CLUSTERING-AWARE NONNEGATIVE MATRIX FACTORIZATION
    Yin, Jihao
    Huang, Chenyu
    Luo, Xiaoyan
    Qv, Hui
    Liu, Xiang
    Han, Bingnan
    2016 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2016, : 5880 - 5883