EMCRL: EM-Enhanced Negative Sampling Strategy for Contrastive Representation Learning

Cited by: 0
Authors
Zhang, Kun [1]
Lv, Guangyi [2]
Wu, Le [1]
Hong, Richang [1]
Wang, Meng [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp & Informat, Hefei 230029, Anhui, Peoples R China
[2] Lenovo Res, AI Lab, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Representation learning; Data augmentation; Data models; Semantics; Optimization; Estimation; Sampling methods; Robustness; Natural languages; Crops; Contrastive learning (CL); expectation maximization (EM); negative examples
DOI
10.1109/TCSS.2024.3454056
CLC Classification Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
As a representative framework of self-supervised learning (SSL), contrastive learning (CL) has drawn enormous attention in the representation learning area. By pulling a "positive" example toward an anchor while pushing many "negative" examples away from it, CL can generate high-quality representations for data of different modalities. The quality of the selected positive and negative examples is therefore critical to the performance of CL-based models. However, under the assumption that labels are unavailable, most existing work follows the paradigm of contrastive instance discrimination, which treats each input instance as an individual category. As a result, such work focuses mainly on positive example generation and has designed plenty of data augmentation strategies, while for negative examples it simply adopts the in-batch negative sampling strategy. We argue that this negative sampling strategy easily selects false negatives and inhibits the capability of CL; we also believe this is one reason why CL requires a large number of negatives. Rather than resorting to annotated labels, we tackle this problem in an unsupervised manner. We propose to integrate expectation maximization (EM) into the selection of negative examples and develop a novel EM-enhanced negative sampling strategy (EMCRL) that distinguishes false negatives from true ones to improve CL performance. Specifically, EMCRL employs EM to estimate the distribution of ground-truth relations between each sample and its in-batch negatives, and then optimizes model parameters with these estimates. Considering the sensitivity of the EM algorithm to parameter initialization, we add a random flip to the distribution estimation to enhance the robustness of the learning process. Extensive experiments with several advanced models on sentence representation and image representation tasks demonstrate the effectiveness of EMCRL. Our method is easy to implement, and the code is publicly available at https://github.com/zhangkunzk/EMCRL_pytorch.
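To make the sampling idea concrete, the sketch below gives a minimal PyTorch reading of EM-style negative reweighting as the abstract describes it: an E-step that estimates, for every in-batch pair, the probability of being a true negative (with a random flip for robustness), and an M-step that optimizes an InfoNCE-style loss under those estimates. This is an illustrative assumption, not the authors' released implementation (see the repository linked above); the sigmoid-based relatedness score, the function names, and the flip probability p_flip are all hypothetical.

```python
# Illustrative sketch only: NOT the official EMCRL code. The E-step here
# (sigmoid relatedness) and all names are assumptions for exposition.
import torch
import torch.nn.functional as F

def estimate_true_negative_probs(sim: torch.Tensor, p_flip: float = 0.1) -> torch.Tensor:
    """E-step (sketch): estimate p(true negative) for each in-batch pair.

    High similarity to the anchor hints at a false negative (a semantically
    related sample), so we take the complement of a soft relatedness score.
    """
    with torch.no_grad():
        relatedness = torch.sigmoid(sim)          # high sim -> likely false negative
        p_true_neg = 1.0 - relatedness
        # Random flip: perturb a fraction of the estimates so a poor
        # initialization cannot lock the EM loop into bad assignments.
        flip = torch.rand_like(p_true_neg) < p_flip
        p_true_neg = torch.where(flip, 1.0 - p_true_neg, p_true_neg)
    return p_true_neg

def em_weighted_infonce(anchor: torch.Tensor, positive: torch.Tensor,
                        tau: float = 0.05, p_flip: float = 0.1) -> torch.Tensor:
    """M-step (sketch): in-batch InfoNCE with each negative logit shifted by
    log p(true negative), so likely false negatives contribute less."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    sim = a @ p.t()                               # (B, B) cosine similarities
    labels = torch.arange(a.size(0), device=a.device)
    log_w = estimate_true_negative_probs(sim, p_flip).clamp(min=1e-4).log()
    log_w.fill_diagonal_(0.0)                     # positives keep full weight
    logits = sim / tau + log_w
    return F.cross_entropy(logits, labels)

# Usage: embeddings of a batch and its augmented views, e.g. from an encoder.
# loss = em_weighted_infonce(torch.randn(32, 128), torch.randn(32, 128))
```

Adding log p(true negative) to a logit is the same as multiplying the corresponding exp-similarity term in the InfoNCE denominator by that probability, so a suspected false negative is pushed away from the anchor more gently rather than being dropped outright.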
Pages: 12
Related Papers
50 records in total
  • [41] A semantic framework for enhancing pseudo-relevance feedback with soft negative sampling and contrastive learning
    Pan, Min
    Zhou, Shuting
    Chen, Jinguang
    Huang, Ellen Anne
    Huang, Jimmy X.
    INFORMATION PROCESSING & MANAGEMENT, 2025, 62 (03)
  • [42] Q-SupCon: Quantum-Enhanced Supervised Contrastive Learning Architecture within the Representation Learning Framework
    Don, Asitha Kottahachchi Kankanamge
    Khalil, Ibrahim
    ACM TRANSACTIONS ON QUANTUM COMPUTING, 2025, 6 (01)
  • [43] Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning
    Liang, Meiyu
    Du, Junping
    Cao, Xiaowen
    Yu, Yang
    Lu, Kangkang
    Xue, Zhe
    Zhang, Min
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022
  • [44] Hierarchical Negative Sampling Based Graph Contrastive Learning Approach for Drug-Disease Association Prediction
    Wang, Yuanxu
    Song, Jinmiao
    Dai, Qiguo
    Duan, Xiaodong
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (05) : 3146 - 3157
  • [45] Contrastive self-supervised representation learning without negative samples for multimodal human action recognition
    Yang, Huaigang
    Ren, Ziliang
    Yuan, Huaqiang
    Xu, Zhenyu
    Zhou, Jun
    FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [46] GraSS: Contrastive Learning With Gradient-Guided Sampling Strategy for Remote Sensing Image Semantic Segmentation
    Zhang, Zhaoyang
    Ren, Zhen
    Tao, Chao
    Zhang, Yunsheng
    Peng, Chengli
    Li, Haifeng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 14
  • [47] RiSSNet: Contrastive Learning Network with a Relaxed Identity Sampling Strategy for Remote Sensing Image Semantic Segmentation
    Li, Haifeng
    Jing, Wenxuan
    Wei, Guo
    Wu, Kai
    Su, Mingming
    Liu, Lu
    Wu, Hao
    Li, Penglong
    Qi, Ji
    REMOTE SENSING, 2023, 15 (13)
  • [49] Attributed network representation learning via improved graph attention with robust negative sampling
    Fan, Huilian
    Zhong, Yuanchang
    Zeng, Guangpu
    Sun, Lili
    APPLIED INTELLIGENCE, 2021, 51 (01) : 416 - 426
  • [50] A Topology-Enhanced Multi-Viewed Contrastive Approach for Molecular Graph Representation Learning and Classification
    Pham, Phu
    MOLECULAR INFORMATICS, 2025, 44 (01)