Learning What and Where to Learn: A New Perspective on Self-Supervised Learning

Cited by: 4
Authors
Zhao, Wenyi [1 ]
Yang, Lu [1 ]
Zhang, Weidong [2 ]
Tian, Yongqin [2 ]
Jia, Wenhe [1 ]
Li, Wei [1 ]
Yang, Mu [3 ]
Pan, Xipeng [4 ]
Yang, Huihua [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] Henan Inst Sci & Technol, Sch Informat Engn, Xinxiang 453003, Peoples R China
[3] Techmach Beijing Ind Technol Co Ltd, Beijing 102676, Peoples R China
[4] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Feature extraction; Task analysis; Computational modeling; Optimization; Self-supervised learning; Training; learning what; learning where; efficient framework; positional information;
DOI
10.1109/TCSVT.2023.3298937
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808 ; 0809 ;
Abstract
Self-supervised learning (SSL) has demonstrated its power in generalized model acquisition by leveraging the discriminative semantic and explicit positional information of unlabeled datasets. Unfortunately, mainstream contrastive learning-based methods focus excessively on semantic information and ignore that position is also a carrier of image content, resulting in inadequate data utilization and extensive computational consumption. To address these issues, we present an efficient SSL framework, learning What and Where to learn ($\text{W}^{2}\text{SSL}$), to aggregate semantic and positional features. Concretely, we devise a spatially-coupled sampling manner that processes images through pre-defined rules, integrating the advantages of semantic (What) and positional (Where) features into the framework to enrich the diversity of feature representations and improve data utilization. Besides, a spectrum of latent vectors is obtained by mapping the positional features, which implicitly explores the relationships among these vectors. Thereafter, the corresponding discriminative and contrastive optimization objectives are seamlessly embedded in the framework via a cascade paradigm to explore semantic and positional features. The proposed $\text{W}^{2}\text{SSL}$ is verified on different types of datasets and outperforms state-of-the-art SSL methods even with half the computational consumption. Code will be available at https://github.com/WilyZhao8/W2SSL.
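The abstract's core idea is to pair each sampled view's content (the semantic "what") with its spatial origin (the positional "where"), so that position can supervise a discriminative objective alongside the contrastive one. The sketch below illustrates this pairing with a simple grid split; the paper's actual pre-defined sampling rules are not given here, so the grid partition and position-as-class-index encoding are purely illustrative assumptions:

```python
import numpy as np

def spatially_coupled_sample(image, grid=2):
    """Split an image into a grid of patches, returning each patch
    (the semantic 'what') with its grid coordinates (the positional 'where')."""
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    patches, positions = [], []
    for i in range(grid):
        for j in range(grid):
            patches.append(image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw])
            positions.append((i, j))
    return patches, positions

def position_targets(positions, grid=2):
    """Encode grid coordinates as class indices, so a 'where' head can be
    trained with an ordinary discriminative (cross-entropy) objective."""
    return [i * grid + j for (i, j) in positions]
```

In a full pipeline, the patches would feed a contrastive ("what") loss between augmented views, while the class indices would supervise a positional ("where") prediction head; cascading the two objectives in one framework is the contribution the abstract claims.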
Pages: 6620-6633 (14 pages)
Related Papers (50 total)
  • [1] Learning Where to Learn in Cross-View Self-Supervised Learning
    Huang, Lang
    You, Shan
    Zheng, Mingkai
    Wang, Fei
    Qian, Chen
    Yamasaki, Toshihiko
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 14431 - 14440
  • [2] A New Self-supervised Method for Supervised Learning
    Yang, Yuhang
    Ding, Zilin
    Cheng, Xuan
    Wang, Xiaomin
    Liu, Ming
    INTERNATIONAL CONFERENCE ON COMPUTER VISION, APPLICATION, AND DESIGN (CVAD 2021), 2021, 12155
  • [3] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    Frontiers of Computer Science, 2021, (04) : 102 - 104
  • [4] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    Frontiers of Computer Science, 2021, 15
  • [5] A comprehensive perspective of contrastive self-supervised learning
    Chen, Songcan
    Geng, Chuanxing
    FRONTIERS OF COMPUTER SCIENCE, 2021, 15 (04)
  • [6] What Should Be Equivariant In Self-Supervised Learning
    Xie, Yuyang
    Wen, Jianhong
    Lau, Kin Wai
    Rehman, Yasar Abbas Ur
    Shen, Jiajun
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4110 - 4119
  • [7] How and What to Learn: Taxonomizing Self-Supervised Learning for 3D Action Recognition
    Ben Tanfous, Amor
    Zerroug, Aimen
    Linsley, Drew
    Serre, Thomas
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 2888 - 2897
  • [8] Gated Self-supervised Learning for Improving Supervised Learning
    Fuadi, Erland Hillman
    Ruslim, Aristo Renaldo
    Wardhana, Putu Wahyu Kusuma
    Yudistira, Novanto
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 611 - 615
  • [9] Inpaint2Learn: A Self-Supervised Framework for Affordance Learning
    Zhang, Lingzhi
    Du, Weiyu
    Zhou, Shenghao
    Wang, Jiancong
    Shi, Jianbo
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3778 - 3787
  • [10] Repeat and learn: Self-supervised visual representations learning by Scene Localization
    Altabrawee, Hussein
    Noor, Mohd Halim Mohd
    PATTERN RECOGNITION, 2024, 156