Disturbed Augmentation Invariance for Unsupervised Visual Representation Learning

Cited by: 3
Authors
Cheng, Haoyang [1]
Li, Hongliang [1]
Wu, Qingbo [1]
Qiu, Heqian [1]
Zhang, Xiaoliang [1]
Meng, Fanman [1]
Zhao, Taijin [1]
Affiliation
[1] University of Electronic Science and Technology of China (UESTC), School of Information and Communication Engineering, Chengdu 611731, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised learning; self-supervised learning; representation learning; contrastive learning; convolutional neural network;
DOI
10.1109/TCSVT.2023.3272741
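Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808; 0809;
Abstract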
Contrastive learning has recently gained great prominence, achieving excellent performance through simple augmentation invariance. However, simple contrastive pairs lack diversity because the augmentation strategies are mechanical. In this paper, we propose Disturbed Augmentation Invariance (DAI), which increases diversity by constructing disturbed contrastive pairs: for each augmented view, it generates appropriate disturbed views in the feature space. In practice, we establish a multivariate normal distribution for each augmented view, whose mean is the corresponding augmented view and whose covariance matrix is estimated from its nearest neighbors in the dataset. We then sample random vectors from this distribution as the disturbed views to construct disturbed contrastive pairs. To avoid extra computational cost as the number of disturbed contrastive pairs grows, we use an upper bound of the trivial disturbed augmentation invariance loss to construct the DAI loss. In addition, inspired by the Information Bottleneck principle, we propose a Bottleneck version of Disturbed Augmentation Invariance (BDAI), which further refines the extracted information and learns a compact representation by additionally increasing the variance of the original contrastive pair. To make BDAI work effectively, we design a statistical strategy to control the balance between the amount of information shared by all disturbed contrastive pairs and the compactness of the representation. Our approach yields consistent improvements over popular contrastive learning methods on a variety of downstream tasks, e.g., image classification, object detection, and instance segmentation.
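The sampling step described in the abstract (a multivariate normal centered on an augmented view, with covariance estimated from its nearest neighbors) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the authors' implementation: the function names, the Euclidean neighbor search, the diagonal `eps` regularizer, the choice to center the covariance on the neighbors' own mean, and the parameters `k` and `num_views` are all hypothetical. The DAI loss and its upper bound are not shown.

```python
import math
import random


def nearest_neighbors(query, bank, k):
    """Return the k vectors in `bank` closest to `query` (Euclidean distance).

    Assumption: the abstract does not say how neighbors are searched.
    """
    return sorted(bank, key=lambda v: math.dist(query, v))[:k]


def covariance(vectors, eps=1e-6):
    """Sample covariance of `vectors`, centered on their own mean.

    `eps` is added to the diagonal (an assumption of this sketch) so the
    matrix stays positive definite even with few neighbors.
    """
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
            for j in range(d)] for i in range(d)]
    for i in range(d):
        cov[i][i] += eps
    return cov


def cholesky(a):
    """Lower-triangular L with L * L^T == a, for symmetric positive-definite a."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def disturbed_views(feature, bank, k=8, num_views=4, rng=None):
    """Sample `num_views` vectors from N(feature, Sigma).

    Sigma is estimated from the k nearest neighbors of `feature` in `bank`.
    Each sample is feature + L z, where z is standard normal and L is the
    Cholesky factor of Sigma.
    """
    rng = rng or random.Random()
    low = cholesky(covariance(nearest_neighbors(feature, bank, k)))
    d = len(feature)
    views = []
    for _ in range(num_views):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        views.append([feature[i] + sum(low[i][j] * z[j] for j in range(i + 1))
                      for i in range(d)])
    return views
```

In a training setup, `bank` would stand in for stored feature vectors of the dataset and `feature` for the embedding of one augmented view; each returned vector would then be paired with the other augmented view to form a disturbed contrastive pair.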
Pages: 6924-6938
Number of pages: 15