Peripheral Vision Transformer

Cited by: 0
Authors
Min, Juhong [1]
Zhao, Yucheng [2, 3]
Luo, Chong [2]
Cho, Minsu [1]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Pohang, South Korea
[2] Microsoft Res Asia MSRA, Beijing, Peoples R China
[3] Univ Sci & Technol China, Hefei, Peoples R China
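Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract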
Human vision possesses a special type of visual processing system called peripheral vision. By partitioning the entire visual field into multiple contour regions based on the distance to the center of our gaze, peripheral vision gives us the ability to perceive various visual features in different regions. In this work, we take a biologically inspired approach and explore modeling peripheral vision in deep neural networks for visual recognition. We propose incorporating peripheral position encoding into the multi-head self-attention layers so that the network learns to partition the visual field into diverse peripheral regions given training data. We evaluate the proposed network, dubbed PerViT, on ImageNet-1K and systematically investigate the inner workings of the model for machine perception, showing that the network learns to perceive visual data in a way similar to human vision. The performance improvements in image classification over the baselines across different model sizes demonstrate the efficacy of the proposed method.
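The abstract describes adding a position term to self-attention so that attention depends on the distance between a query patch and the other patches. The sketch below illustrates that general idea only, and it is not the PerViT formulation: it assumes a fixed linear penalty on Euclidean patch distance in place of the learned peripheral position encoding, and every name in it (`patch_distances`, `position_biased_attention`, `slope`) is hypothetical.

```python
import math


def patch_distances(grid):
    """Euclidean distance between every pair of patches on a grid x grid layout."""
    coords = [(r, c) for r in range(grid) for c in range(grid)]
    return [
        [math.hypot(r1 - r2, c1 - c2) for (r2, c2) in coords]
        for (r1, c1) in coords
    ]


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def position_biased_attention(content_logits, distances, slope):
    """Subtract slope * distance from each content logit, then softmax per query.

    ASSUMPTION: the linear distance penalty is a hand-written stand-in for a
    learned position encoding; the paper's actual function is not given here.
    """
    return [
        softmax([logit - slope * d for logit, d in zip(l_row, d_row)])
        for l_row, d_row in zip(content_logits, distances)
    ]


grid = 3                      # 3 x 3 patches, 9 tokens in total
n = grid * grid
dist = patch_distances(grid)
uniform_logits = [[0.0] * n for _ in range(n)]  # no content preference

attn = position_biased_attention(uniform_logits, dist, slope=1.0)
center = 4                    # index of the middle patch (row 1, column 1)
```

With uniform content logits, the weights in `attn[center]` depend only on distance, so the middle patch attends most to itself and least to the corners. Setting `slope=0.0` recovers plain uniform attention, which shows that the position term alone shapes the regions in this toy setting.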
Pages: 15