Evolved Hierarchical Masking for Self-Supervised Learning

Times Cited: 0
|
Authors
Feng, Zhanzhou [1 ]
Zhang, Shiliang [1 ,2 ]
Affiliations
[1] Peking Univ, Sch Comp Sci, State Key Lab Multimedia Informat Proc, Beijing 100871, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Keywords
Visualization; Training; Semantics; Self-supervised learning; Semantic segmentation; Image classification; Computational modeling; Neurons; Representation learning; Predictive models; masked image modeling; efficient learning; model pretraining
DOI
10.1109/TPAMI.2024.3490776
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing Masked Image Modeling methods apply fixed mask patterns to guide self-supervised training. Because different mask patterns rely on different criteria to depict image content, sticking to a single fixed pattern limits the model's ability to capture diverse visual cues. This paper introduces an evolved hierarchical masking method that pursues general visual cue modeling in self-supervised learning. The proposed method leverages the vision model being trained to parse the input visual cues into a hierarchical structure, which is then used to generate the masks. The accuracy of this hierarchy tracks the capability of the model being trained, so the mask patterns evolve across training stages: initially, the generated masks focus on low-level visual cues to grasp basic textures, then gradually evolve to depict higher-level cues, reinforcing the learning of more complicated object semantics and contexts. Our method requires no extra pre-trained models or annotations, and it maintains training efficiency by evolving the training difficulty. We conduct extensive experiments on seven downstream tasks, including partial-duplicate image retrieval, which relies on low-level details, as well as image classification and semantic segmentation, which require semantic parsing capability. Experimental results demonstrate that the method substantially boosts performance across these tasks; for instance, it surpasses the recent MAE by 1.1% on ImageNet-1K classification and 1.4% on ADE20K segmentation with the same number of training epochs. We also relate the proposed method to the current research focus on LLMs: the proposed approach narrows the gap to large-scale pre-training on semantically demanding tasks and enhances the perception of intricate details in tasks requiring low-level feature recognition.
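The abstract describes the mechanism only at a high level, so the following minimal sketch may help make it concrete. It is not the authors' implementation: it assumes patch features come from the encoder currently being trained, approximates the paper's model-driven hierarchy with off-the-shelf agglomerative clustering (SciPy), and schedules the hierarchy level by training progress so that masks grow from small low-level groups toward large semantic regions. The function name and parameters (evolved_hierarchical_mask, progress, mask_ratio) are illustrative, not from the paper.

# Minimal sketch of evolved hierarchical masking; NOT the authors' code.
# Assumption: the visual-cue hierarchy is approximated by agglomerative
# clustering of patch features from the encoder being trained, with the
# clustering granularity scheduled by training progress.
import torch
from scipy.cluster.hierarchy import linkage, fcluster

def evolved_hierarchical_mask(patch_feats: torch.Tensor,
                              progress: float,
                              mask_ratio: float = 0.75) -> torch.Tensor:
    """patch_feats: (N, D) features of one image's N patches, taken from
    the model currently being trained; progress: training progress in
    [0, 1]. Returns a boolean mask of shape (N,) marking masked patches."""
    n = patch_feats.shape[0]
    # Parse patches into a hierarchy (a stand-in for the paper's
    # model-driven parsing of visual cues).
    tree = linkage(patch_feats.detach().cpu().numpy(), method="average")
    # Early training: many small clusters (low-level cues, textures).
    # Late training: few large clusters (object-level semantics).
    n_clusters = max(2, int(round(n * 0.5 * (1.0 - progress))))
    labels = torch.from_numpy(
        fcluster(tree, t=n_clusters, criterion="maxclust").astype("int64"))
    # Mask entire clusters at random until the target ratio is reached,
    # so each mask hides a coherent group of visual cues.
    mask = torch.zeros(n, dtype=torch.bool)
    for c in torch.randperm(int(labels.max())) + 1:
        if mask.float().mean() >= mask_ratio:
            break
        mask |= labels == c
    return mask

# Example: 196 patches (a 14x14 grid) with 768-dim features.
feats = torch.randn(196, 768)
print(evolved_hierarchical_mask(feats, progress=0.1).float().mean())

In an MAE-style pre-training loop one would call this per image with progress = epoch / total_epochs and reconstruct only the masked patches. The design choice mirrored here is the paper's central one: mask granularity is tied to the model's own current representation rather than to a fixed random pattern.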
Pages: 1013-1027
Number of Pages: 15