COSDA: Covariance regularized semantic data augmentation for self-supervised visual representation learning

Cited by: 0

Authors:

Chen, Hui

Ma, Yongqiang

Jiang, Jingjing

Zheng, Nanning [1]

Affiliations:

[1] Xi An Jiao Tong Univ, Natl Engn Res Ctr Visual Informat & Applicat, Natl Key Lab Human Machine Hybrid Augmented Intell, Xian 710049, Shaanxi, Peoples R China

Source:

KNOWLEDGE-BASED SYSTEMS | 2025 / Vol. 311

Funding:

National Natural Science Foundation of China;

Keywords:

Self-supervised visual representation learning; Contrastive learning; Semantic data augmentation;

DOI:

10.1016/j.knosys.2025.113080

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent contrastive learning-based self-supervised learning has seen significant improvements through employing an extensive data augmentation strategy, particularly focusing on the generation of positive pairs. However, the current techniques primarily operate at the pixel level, confined to basic spatial and color transformations, thus lacking the capability to incorporate more complex semantic alterations such as object repositioning, rotation, or color modification within the image. Consequently, the resultant positive pairs are less informative for learning features that are invariant to such semantic variations. In this work, we introduce a new methodology termed COvariance Regularized Semantic Data Augmentation (COSDA), designed to generate a diverse collection of feature embeddings that serve as positives relative to an anchor point. These generated features are intended to possess distinct semantic characteristics from the anchor point while maintaining consistent category identities, accomplished through Gaussian sampling in the deep feature space. By theoretically analyzing the scenario where the number of generated positive features approaches infinity, we establish an upper bound for the InfoNCE loss and optimize this bound without explicit feature generation. Rigorous experimental assessments, conducted on datasets of varying scales, alongside downstream tasks encompassing detection and segmentation, corroborate the efficacy of COSDA.
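The abstract's idea of drawing positives by Gaussian sampling in the deep feature space can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`sample_positives`, `info_nce`), the `strength` scale, the temperature value, and the toy diagonal covariance are all assumptions made for illustration. The sketch samples positives explicitly and averages the InfoNCE loss over them, which is the costly procedure that, according to the abstract, COSDA avoids by optimizing an upper bound instead.

```python
import numpy as np


def sample_positives(anchor, cov, num_samples, strength=0.5, rng=None):
    """Draw synthetic positive features from N(anchor, strength * cov).

    `strength` is a hypothetical scale factor, not a parameter named in the source.
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.multivariate_normal(anchor, strength * cov, size=num_samples)


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce(query, positives, negatives, temperature=0.2):
    """Mean InfoNCE loss of one query over M positives that share K negatives."""
    q = l2_normalize(query)
    pos_logits = l2_normalize(positives) @ q / temperature  # shape (M,)
    neg_logits = l2_normalize(negatives) @ q / temperature  # shape (K,)
    # Subtract the largest logit before exponentiating, for numerical stability.
    shift = max(pos_logits.max(), neg_logits.max())
    neg_sum = np.exp(neg_logits - shift).sum()
    log_denom = shift + np.log(np.exp(pos_logits - shift) + neg_sum)
    return float(np.mean(log_denom - pos_logits))


# Toy usage with random features; all values here are illustrative only.
rng = np.random.default_rng(0)
dim = 8
anchor = rng.normal(size=dim)
query = anchor + 0.1 * rng.normal(size=dim)
cov = np.diag(rng.uniform(0.01, 0.1, size=dim))  # assumed feature covariance
negatives = rng.normal(size=(16, dim))
positives = sample_positives(anchor, cov, num_samples=64, rng=rng)
loss = info_nce(query, positives, negatives)
```

Averaging over more sampled positives approximates the expected loss more closely but costs proportionally more compute, which is the motivation the abstract gives for working with a bound in the infinite-sample limit.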


Pages: 10

50 items in total

[41] Contrastive Self-supervised Representation Learning Using Synthetic Data
Dong-Yu She
Kun Xu
International Journal of Automation and Computing, 2021, 18 (04) : 556 - 567
[43] Enhancing motion visual cues for self-supervised video representation learning
Nie, Mu
Quan, Zhibin
Ding, Weiping
Yang, Wankou
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
[44] Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Wu, Huimin
Lei, Chenyang
Sun, Xiao
Wang, Peng-Shuai
Chen, Qifeng
Cheng, Kwang-Ting
Lin, Stephen
Wu, Zhirong
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16259 - 16270
[45] Semantic-Aware Auto-Encoders for Self-supervised Representation Learning
Wang, Guangrun
Tang, Yansong
Lin, Liang
Torr, Philip H. S.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 9654 - 9665
[46] Semantic Segmentation of Remote Sensing Images With Self-Supervised Multitask Representation Learning
Li, Wenyuan
Chen, Hao
Shi, Zhenwei
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2021, 14 : 6438 - 6450
[47] Semantic Pose Verification for Outdoor Visual Localization with Self-supervised Contrastive Learning
Orhan, Semih
Guerrero, Jose J.
Bastanlar, Yalin
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 3988 - 3997
[48] Self-Supervised Embodied Learning for Semantic Segmentation
Wang, Juan
Liu, Xinzhu
Zhao, Dawei
Dai, Bin
Liu, Huaping
2023 IEEE INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING, ICDL, 2023, : 383 - 390
[49] Self-Distilled Self-supervised Representation Learning
Jang, Jiho
Kim, Seonhoon
Yoo, Kiyoon
Kong, Chaerin
Kim, Jangho
Kwak, Nojun
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2828 - 2838
[50] Self-Supervised Graphs for Audio Representation Learning With Limited Labeled Data
Shirian, Amir
Somandepalli, Krishna
Guha, Tanaya
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2022, 16 (06) : 1391 - 1401
