Clustering-Aware Negative Sampling for Unsupervised Sentence Representation

Authors
Deng, Jinghao [1]
Wan, Fanqi [1]
Yang, Tao [1]
Quan, Xiaojun [1]
Wang, Rui [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou, Peoples R China
[2] Vipshop China Co Ltd, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
NEURAL-NETWORKS;
DOI
Not available
Abstract
Contrastive learning has been widely studied in sentence representation learning. However, earlier works mainly focus on the construction of positive examples, while in-batch samples are often simply treated as negative examples. This approach overlooks the importance of selecting appropriate negative examples, potentially leading to a scarcity of hard negatives and the inclusion of false negatives. To address these issues, we propose ClusterNS (Clustering-aware Negative Sampling), a novel method that incorporates cluster information into contrastive learning for unsupervised sentence representation learning. We apply a modified K-means clustering algorithm to supply hard negatives and recognize in-batch false negatives during training, aiming to solve the two issues in one unified framework. Experiments on semantic textual similarity (STS) tasks demonstrate that our proposed ClusterNS compares favorably with baselines in unsupervised sentence representation learning. Our code has been made publicly available.
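The abstract does not spell out how cluster information yields hard negatives or flags false negatives, so the following is only a minimal sketch of one plausible reading, not the authors' ClusterNS implementation. It assumes centroids are already available (from ordinary K-means rather than the paper's modified variant), assumes cosine similarity, treats in-batch sentences assigned to the same cluster as candidate false negatives, and takes the nearest other centroid as a hard negative. The function names `cosine` and `cluster_aware_negatives` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_aware_negatives(embeddings, centroids):
    """Illustrative only; assumes at least two centroids.

    Returns:
      assign    - index of the most similar centroid for each sentence
      false_neg - false_neg[i][j] is True when sentences i and j (i != j)
                  share a cluster, i.e. j is a candidate false negative for i
      hard      - index of the most similar centroid other than the
                  sentence's own, used here as a hard negative
    """
    n, k = len(embeddings), len(centroids)
    sims = [[cosine(e, c) for c in centroids] for e in embeddings]
    assign = [max(range(k), key=row.__getitem__) for row in sims]
    false_neg = [
        [i != j and assign[i] == assign[j] for j in range(n)]
        for i in range(n)
    ]
    hard = [
        max((c for c in range(k) if c != assign[i]), key=sims[i].__getitem__)
        for i in range(n)
    ]
    return assign, false_neg, hard


# Toy batch: the first two sentences point in nearly the same direction.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
centers = [[1.0, 0.0], [0.0, 1.0]]
assign, false_neg, hard = cluster_aware_negatives(batch, centers)
```

In this toy batch the first two sentences fall in the same cluster and are therefore flagged as candidate false negatives of each other, while the third is not. A contrastive loss could then down-weight or mask the flagged pairs and add each sentence's hard-negative centroid to its negative set.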
Pages: 8713-8729
Page count: 17