SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

Cited by: 10
Authors
Wang, Hao [1 ]
Dou, Yong [1 ]
Affiliations
[1] Natl Univ Def Technol, Changsha 410073, Peoples R China
Keywords
Unsupervised Sentence Embedding; Contrastive Learning; Feature Suppression; Soft Negative Samples; Bidirectional Margin Loss;
DOI
10.1007/978-981-99-4752-2_35
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Unsupervised sentence embedding aims to obtain the most appropriate embedding for a sentence to reflect its semantics. Contrastive learning has been attracting growing attention. For a given sentence, current models apply diverse data augmentation methods to generate positive samples, while treating other, independent sentences as negative samples. They then adopt the InfoNCE loss to pull the embeddings of positive pairs together and push those of negative pairs apart. Although these models have made great progress, we argue that they may suffer from feature suppression: the models fail to distinguish and decouple textual similarity from semantic similarity. They may overestimate the semantic similarity of any sentence pair with similar text regardless of the actual semantic difference between them, and vice versa. Herein, we propose contrastive learning for unsupervised sentence embedding with soft negative samples (SNCSE). Soft negative samples share highly similar text with the original samples but have clearly different semantics. Specifically, we take the negation of each original sentence as its soft negative sample, and propose Bidirectional Margin Loss (BML) to introduce soft negatives into the traditional contrastive learning framework. Our experimental results on the semantic textual similarity (STS) task show that SNCSE obtains state-of-the-art performance with different encoders, indicating its strength on unsupervised sentence embedding. Our code and models are released at https://github.com/Sense-GVT/SNCSE.
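The loss setup the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the hinge form of the margin loss, the margins `alpha` and `beta`, and the temperature value are assumptions; the idea is only that a soft negative (e.g. a negated sentence) should be somewhat less similar to the anchor than the positive, but within a bounded gap rather than pushed arbitrarily far away.

```python
import numpy as np

def cosine(u, v):
    # cosine similarity between two embedding vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def info_nce(anchor, positive, negatives, temperature=0.05):
    # standard InfoNCE: -log( exp(s_pos/t) / sum_j exp(s_j/t) )
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = np.array(sims) / temperature
    logits -= logits.max()  # numerical stability
    return float(-np.log(np.exp(logits[0]) / np.exp(logits).sum()))

def bidirectional_margin_loss(anchor, positive, soft_negative,
                              alpha=0.1, beta=0.3):
    # Hypothetical BML sketch: constrain the similarity gap
    #   delta = cos(anchor, positive) - cos(anchor, soft_negative)
    # to lie inside [alpha, beta]; penalize it from both directions.
    delta = cosine(anchor, positive) - cosine(anchor, soft_negative)
    return max(0.0, alpha - delta) + max(0.0, delta - beta)
```

With a gap of 0.2 between the positive and soft-negative similarities and margins `alpha=0.1`, `beta=0.3`, the margin term contributes zero; only gaps outside the band are penalized, which keeps the soft negative close in text space without collapsing it onto the positive.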
Pages: 419-431
Page count: 13