Learning What and Where to Learn: A New Perspective on Self-Supervised Learning

Cited by: 4
Authors
Zhao, Wenyi [1 ]
Yang, Lu [1 ]
Zhang, Weidong [2 ]
Tian, Yongqin [2 ]
Jia, Wenhe [1 ]
Li, Wei [1 ]
Yang, Mu [3 ]
Pan, Xipeng [4 ]
Yang, Huihua [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] Henan Inst Sci & Technol, Sch Informat Engn, Xinxiang 453003, Peoples R China
[3] Techmach Beijing Ind Technol Co Ltd, Beijing 102676, Peoples R China
[4] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Feature extraction; Task analysis; Computational modeling; Optimization; Self-supervised learning; Training; learning what; learning where; efficient framework; positional information;
DOI
10.1109/TCSVT.2023.3298937
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808 ; 0809 ;
Abstract
Self-supervised learning (SSL) has demonstrated its power in generalized model acquisition by leveraging the discriminative semantic and explicit positional information of unlabeled datasets. Unfortunately, mainstream contrastive learning-based methods focus excessively on semantic information and ignore that position is also a carrier of image content, resulting in inadequate data utilization and extensive computational consumption. To address these issues, we present an efficient SSL framework, learning What and Where to learn ($\text{W}^{2}\text{SSL}$), to aggregate semantic and positional features. Concretely, we devise a spatially-coupled sampling manner that processes images through pre-defined rules, integrating the advantages of semantic (What) and positional (Where) features into the framework to enrich the diversity of feature representations and improve data utilization. Besides, a spectrum of latent vectors is obtained by mapping the positional features, which implicitly explores the relationships among these vectors. Thereafter, the corresponding discriminative and contrastive optimization objectives are seamlessly embedded in the framework via a cascade paradigm to explore semantic and positional features. The proposed $\text{W}^{2}\text{SSL}$ is verified on different types of datasets and outperforms state-of-the-art SSL methods even with half the computational consumption. Code will be available at https://github.com/WilyZhao8/W2SSL.
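The abstract's core idea is to pair each sampled view's content (the semantic "what") with its spatial origin (the positional "where"), so that position can supervise a discriminative objective alongside the contrastive one. The sketch below illustrates this pairing with a simple grid split; the paper's actual pre-defined sampling rules are not given here, so the grid partition and position-as-class-index encoding are purely illustrative assumptions:

```python
import numpy as np

def spatially_coupled_sample(image, grid=2):
    """Split an image into a grid of patches, returning each patch
    (the semantic 'what') with its grid coordinates (the positional 'where')."""
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    patches, positions = [], []
    for i in range(grid):
        for j in range(grid):
            patches.append(image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw])
            positions.append((i, j))
    return patches, positions

def position_targets(positions, grid=2):
    """Encode grid coordinates as class indices, so a 'where' head can be
    trained with an ordinary discriminative (cross-entropy) objective."""
    return [i * grid + j for (i, j) in positions]
```

In a full pipeline, the patches would feed a contrastive ("what") loss between augmented views, while the class indices would supervise a positional ("where") prediction head; cascading the two objectives in one framework is the contribution the abstract claims.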
Pages: 6620-6633 (14 pages)
Related Papers (50 total)
  • [1] Learning Where to Learn in Cross-View Self-Supervised Learning
    Huang, Lang
    You, Shan
    Zheng, Mingkai
    Wang, Fei
    Qian, Chen
    Yamasaki, Toshihiko
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14431 - 14440
  • [2] A New Self-supervised Method for Supervised Learning
    Yang, Yuhang
    Ding, Zilin
    Cheng, Xuan
    Wang, Xiaomin
    Liu, Ming
    INTERNATIONAL CONFERENCE ON COMPUTER VISION, APPLICATION, AND DESIGN (CVAD 2021), 2021, 12155
  • [3] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    Frontiers of Computer Science, 2021, (04) : 102 - 104
  • [4] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    Frontiers of Computer Science, 2021, 15
  • [5] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    FRONTIERS OF COMPUTER SCIENCE, 2021, 15 (04)
  • [6] What Should Be Equivariant In Self-Supervised Learning
    Xie, Yuyang
    Wen, Jianhong
    Lau, Kin Wai
    Rehman, Yasar Abbas Ur
    Shen, Jiajun
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4110 - 4119
  • [7] How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
    Ben Tanfous, Amor
    Zerroug, Aimen
    Linsley, Drew
    Serre, Thomas
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2888 - 2897
  • [8] Gated Self-supervised Learning for Improving Supervised Learning
    Fuadi, Erland Hillman
    Ruslim, Aristo Renaldo
    Wardhana, Putu Wahyu Kusuma
    Yudistira, Novanto
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 611 - 615
  • [9] Inpaint2Learn: A Self-Supervised Framework for Affordance Learning
    Zhang, Lingzhi
    Du, Weiyu
    Zhou, Shenghao
    Wang, Jiancong
    Shi, Jianbo
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3778 - 3787
  • [10] Repeat and learn: Self-supervised visual representations learning by Scene Localization
    Altabrawee, Hussein
    Noor, Mohd Halim Mohd
    PATTERN RECOGNITION, 2024, 156