Grouped Contrastive Learning of Self-Supervised Sentence Representation

Cited by: 1
|
Authors
Wang, Qian [1 ]
Zhang, Weiqi [1 ]
Lei, Tianyi [1 ]
Peng, Dezhong [1 ,2 ,3 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci & Technol, Chengdu 610065, Peoples R China
[2] Chengdu Ruibei Yingte Informat Technol Co Ltd, Chengdu 610054, Peoples R China
[3] Sichuan Zhiqian Technol Co Ltd, Chengdu 610065, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023, Vol. 13, Issue 17
Keywords
contrastive learning; self-attention; data augmentation; grouped representation; unsupervised learning;
DOI
10.3390/app13179873
CLC Classification Number
O6 [Chemistry];
Subject Classification Number
0703 ;
Abstract
This paper proposes Grouped Contrastive Learning of self-supervised Sentence Representation (GCLSR), a method that learns effective and meaningful sentence representations. Previous works take maximizing the similarity between two vectors as the objective of contrastive learning and therefore suffer from the high dimensionality of those vectors. In addition, most previous works adopt discrete data augmentation to obtain positive samples and directly employ a contrastive framework from computer vision, which can hamper contrastive training because text data are discrete and sparse compared with image data. To address these issues, we design a novel contrastive learning framework, GCLSR, which divides the high-dimensional feature vector into several groups and computes a contrastive loss for each group separately, exploiting more local information and ultimately yielding a more fine-grained sentence representation. GCLSR further incorporates a new self-attention mechanism together with a continuous, partial-word vector augmentation (PWVA). For discrete and sparse text data, self-attention helps the model focus on informative words by measuring the importance of every word in a sentence, while PWVA supplies high-quality positive samples for contrastive learning. Experimental results demonstrate that the proposed GCLSR achieves encouraging results on the challenging semantic textual similarity (STS) and transfer tasks.
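The abstract describes the core idea but gives no pseudocode, so here is a minimal sketch of a grouped contrastive (InfoNCE-style) loss under the stated assumptions: each sentence embedding is split into equal-sized groups, a contrastive loss is computed per group, and the group losses are averaged. The function name `grouped_contrastive_loss` and parameters `num_groups` and `temperature` are illustrative assumptions, not the authors' published API.

```python
import numpy as np

def grouped_contrastive_loss(z1, z2, num_groups=4, temperature=0.05):
    """Sketch of a grouped contrastive loss.

    z1, z2 : (batch, dim) embeddings of two augmented views of the
             same sentences; row i of z1 and row i of z2 are positives.
    Splits each embedding into num_groups chunks and averages an
    InfoNCE loss computed independently on each chunk.
    """
    batch, dim = z1.shape
    assert dim % num_groups == 0, "dim must be divisible by num_groups"
    g1 = z1.reshape(batch, num_groups, dim // num_groups)
    g2 = z2.reshape(batch, num_groups, dim // num_groups)

    losses = []
    for g in range(num_groups):
        # L2-normalize each group's sub-vector, then take cosine similarities.
        a = g1[:, g] / np.linalg.norm(g1[:, g], axis=1, keepdims=True)
        b = g2[:, g] / np.linalg.norm(g2[:, g], axis=1, keepdims=True)
        sim = (a @ b.T) / temperature            # (batch, batch) logits
        sim = sim - sim.max(axis=1, keepdims=True)   # numerical stability
        log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
        # Positives sit on the diagonal (view i of z1 vs. view i of z2).
        losses.append(-np.mean(np.diag(log_prob)))
    return float(np.mean(losses))
```

Because each group is low-dimensional, the per-group losses capture local structure that a single loss on the full vector would average away; with identical views the loss approaches zero, while mismatched views yield a loss near log(batch).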
Pages: 17
Related Papers
50 records in total
  • [31] Contrastive and Non-Contrastive Strategies for Federated Self-Supervised Representation Learning and Deep Clustering
    Miao, Runxuan
    Koyuncu, Erdem
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2024, 18 (06) : 1070 - 1084
  • [32] Cut-in maneuver detection with self-supervised contrastive video representation learning
    Nalcakan, Yagiz
    Bastanlar, Yalin
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (06) : 2915 - 2923
  • [33] SELF-SUPERVISED CONTRASTIVE LEARNING FOR CROSS-DOMAIN HYPERSPECTRAL IMAGE REPRESENTATION
    Lee, Hyungtae
    Kwon, Heesung
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3239 - 3243
  • [34] Cross-View Temporal Contrastive Learning for Self-Supervised Video Representation
    Wang, Lulu
    Xu, Zengmin
    Zhang, Xuelian
    Meng, Ruxing
    Lu, Tao
    Computer Engineering and Applications, 2024, 60 (18) : 158 - 166
  • [35] TimeCLR: A self-supervised contrastive learning framework for univariate time series representation
    Yang, Xinyu
    Zhang, Zhenguo
    Cui, Rongyi
    KNOWLEDGE-BASED SYSTEMS, 2022, 245
  • [36] Generative Variational-Contrastive Learning for Self-Supervised Point Cloud Representation
    Wang, Bohua
    Tian, Zhiqiang
    Ye, Aixue
    Wen, Feng
    Du, Shaoyi
    Gao, Yue
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (09) : 6154 - 6166
  • [37] Contrastive self-supervised representation learning framework for metal surface defect detection
    Zabin, Mahe
    Kabir, Anika Nahian Binte
    Kabir, Muhammad Khubayeeb
    Choi, Ho-Jin
    Uddin, Jia
    JOURNAL OF BIG DATA, 2023, 10
  • [38] Attentive spatial-temporal contrastive learning for self-supervised video representation
    Yang, Xingming
    Xiong, Sixuan
    Wu, Kewei
    Shan, Dongfeng
    Xie, Zhao
    IMAGE AND VISION COMPUTING, 2023, 137
  • [39] Contrastive self-supervised representation learning framework for metal surface defect detection
    Zabin, Mahe
    Kabir, Anika Nahian Binte
    Kabir, Muhammad Khubayeeb
    Choi, Ho-Jin
    Uddin, Jia
    JOURNAL OF BIG DATA, 2023, 10 (01)
  • [40] CARLA: Self-supervised contrastive representation learning for time series anomaly detection
    Darban, Zahra Zamanzadeh
    Webb, Geoffrey I.
    Pan, Shirui
    Aggarwal, Charu C.
    Salehi, Mahsa
    PATTERN RECOGNITION, 2025, 157