Evolved Hierarchical Masking for Self-Supervised Learning

Times Cited: 0
|
Authors
Feng, Zhanzhou [1 ]
Zhang, Shiliang [1 ,2 ]
Affiliations
[1] Peking Univ, Sch Comp Sci, State Key Lab Multimedia Informat Proc, Beijing 100871, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Keywords
Visualization; Training; Semantics; Self-supervised learning; Semantic segmentation; Image classification; Computational modeling; Neurons; Representation learning; Predictive models; masked image modeling; efficient learning; model pretraining
DOI
10.1109/TPAMI.2024.3490776
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing Masked Image Modeling methods apply fixed mask patterns to guide self-supervised training. Because different mask patterns rely on different criteria to depict image content, sticking to a single fixed pattern limits the model's ability to capture diverse visual cues. This paper introduces an evolved hierarchical masking method that pursues general visual cue modeling in self-supervised learning. The proposed method leverages the vision model being trained to parse the input visual cues into a hierarchical structure, which is then used to generate the masks. The accuracy of this hierarchy tracks the capability of the model being trained, so the mask patterns evolve across training stages: initially, the generated masks focus on low-level visual cues to grasp basic textures, then gradually evolve to depict higher-level cues, reinforcing the learning of more complicated object semantics and contexts. Our method requires no extra pre-trained models or annotations, and it maintains training efficiency by evolving the training difficulty. We conduct extensive experiments on seven downstream tasks, including partial-duplicate image retrieval, which relies on low-level details, as well as image classification and semantic segmentation, which require semantic parsing capability. Experimental results demonstrate that the method substantially boosts performance across these tasks; for instance, it surpasses the recent MAE by 1.1% on ImageNet-1K classification and 1.4% on ADE20K segmentation with the same number of training epochs. We also relate the proposed method to the current research focus on LLMs: the proposed approach narrows the gap to large-scale pre-training on semantically demanding tasks and enhances the perception of intricate details in tasks requiring low-level feature recognition.
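The abstract describes the mechanism only at a high level, so the following minimal sketch may help make it concrete. It is not the authors' implementation: it assumes patch features come from the encoder currently being trained, approximates the paper's model-driven hierarchy with off-the-shelf agglomerative clustering (SciPy), and schedules the hierarchy level by training progress so that masks grow from small low-level groups toward large semantic regions. The function name and parameters (evolved_hierarchical_mask, progress, mask_ratio) are illustrative, not from the paper.

# Minimal sketch of evolved hierarchical masking; NOT the authors' code.
# Assumption: the visual-cue hierarchy is approximated by agglomerative
# clustering of patch features from the encoder being trained, with the
# clustering granularity scheduled by training progress.
import torch
from scipy.cluster.hierarchy import linkage, fcluster

def evolved_hierarchical_mask(patch_feats: torch.Tensor,
                              progress: float,
                              mask_ratio: float = 0.75) -> torch.Tensor:
    """patch_feats: (N, D) features of one image's N patches, taken from
    the model currently being trained; progress: training progress in
    [0, 1]. Returns a boolean mask of shape (N,) marking masked patches."""
    n = patch_feats.shape[0]
    # Parse patches into a hierarchy (a stand-in for the paper's
    # model-driven parsing of visual cues).
    tree = linkage(patch_feats.detach().cpu().numpy(), method="average")
    # Early training: many small clusters (low-level cues, textures).
    # Late training: few large clusters (object-level semantics).
    n_clusters = max(2, int(round(n * 0.5 * (1.0 - progress))))
    labels = torch.from_numpy(
        fcluster(tree, t=n_clusters, criterion="maxclust").astype("int64"))
    # Mask entire clusters at random until the target ratio is reached,
    # so each mask hides a coherent group of visual cues.
    mask = torch.zeros(n, dtype=torch.bool)
    for c in torch.randperm(int(labels.max())) + 1:
        if mask.float().mean() >= mask_ratio:
            break
        mask |= labels == c
    return mask

# Example: 196 patches (a 14x14 grid) with 768-dim features.
feats = torch.randn(196, 768)
print(evolved_hierarchical_mask(feats, progress=0.1).float().mean())

In an MAE-style pre-training loop one would call this per image with progress = epoch / total_epochs and reconstruct only the masked patches. The design choice mirrored here is the paper's central one: mask granularity is tied to the model's own current representation rather than to a fixed random pattern.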
Pages: 1013-1027
Number of Pages: 15