Improving Visual Representation Learning through Perceptual Understanding

Cited by: 2
Authors
Tukra, Samyakh [1]
Hoffman, Frederick [1]
Chatfield, Ken [1]
Affiliations
[1] Tractable AI, London, England
DOI
10.1109/CVPR52729.2023.01392
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present an extension to masked autoencoders (MAE) that improves the representations learnt by the model by explicitly encouraging the learning of higher scene-level features. We do this by (i) introducing a perceptual similarity term between generated and real images, and (ii) incorporating several techniques from the adversarial training literature, including multi-scale training and adaptive discriminator augmentation. Together, these yield not only better pixel reconstruction but also representations that appear to capture higher-level details within images more effectively. More consequentially, we show that our method, Perceptual MAE, outperforms previous methods when used for downstream tasks. We achieve 78.1% top-1 accuracy with linear probing on ImageNet-1K and up to 88.1% when fine-tuning, with similar results for other downstream tasks, all without the use of additional pre-trained models or data.
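As a rough illustration of the first ingredient named in the abstract, the sketch below adds a perceptual (feature-space) distance to a masked pixel reconstruction loss. It is not the authors' implementation: the function names, the `weight` parameter, the use of mean squared error for both terms, and the idea that feature maps are supplied as a list of arrays from some unspecified feature extractor are all assumptions made for illustration. The adversarial components (multi-scale training, adaptive discriminator augmentation) are not shown.

```python
import numpy as np


def masked_pixel_loss(pred, target, mask):
    """Mean squared error averaged over masked patches only.

    pred, target: arrays of shape (num_patches, patch_dim).
    mask: array of shape (num_patches,), 1 for masked patches, 0 for visible.
    """
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    # Guard against division by zero when no patch is masked.
    return float((per_patch * mask).sum() / max(mask.sum(), 1))


def perceptual_loss(feats_pred, feats_target):
    """Mean squared distance between feature maps, averaged over layers.

    feats_pred, feats_target: lists of same-shaped arrays, one per layer of a
    hypothetical feature extractor (assumption: not specified in the abstract).
    """
    per_layer = [np.mean((fp - ft) ** 2) for fp, ft in zip(feats_pred, feats_target)]
    return float(np.mean(per_layer))


def total_loss(pred, target, mask, feats_pred, feats_target, weight=1.0):
    """Pixel term plus weighted perceptual term (weight is an assumed knob)."""
    pixel = masked_pixel_loss(pred, target, mask)
    perceptual = perceptual_loss(feats_pred, feats_target)
    return pixel + weight * perceptual
```

In this sketch the pixel term only sees masked patches, while the perceptual term compares whole-image features, which is one plausible way to encourage agreement beyond individual pixels.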
Pages: 14486-14495
Page count: 10