Peripheral Vision Transformer

Cited by: 0
Authors
Min, Juhong [1]
Zhao, Yucheng [2,3]
Luo, Chong [2]
Cho, Minsu [1]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Pohang, South Korea
[2] Microsoft Res Asia MSRA, Beijing, Peoples R China
[3] Univ Sci & Technol China, Hefei, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human vision possesses a special type of visual processing system called peripheral vision. By partitioning the entire visual field into multiple contour regions based on the distance to the center of gaze, peripheral vision gives us the ability to perceive different visual features in different regions. In this work, we take a biologically inspired approach and explore modeling peripheral vision in deep neural networks for visual recognition. We propose incorporating peripheral position encoding into the multi-head self-attention layers so that the network learns to partition the visual field into diverse peripheral regions given training data. We evaluate the proposed network, dubbed PerViT, on ImageNet-1K and systematically investigate the inner workings of the model for machine perception, showing that the network learns to perceive visual data similarly to the way human vision does. Performance improvements in image classification over the baselines across different model sizes demonstrate the efficacy of the proposed method.
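The abstract does not spell out how the position encoding enters the attention layers, so the following is only a generic illustration of distance-dependent self-attention, not PerViT's actual formulation. It assumes a single head, a square patch grid, and a sigmoid gate over the Euclidean distance between patches; the names `patch_distances`, `distance_gated_attention`, `radius`, and `sharpness` are hypothetical, and the two gate parameters stand in for values a real network would learn.

```python
import numpy as np


def patch_distances(grid: int) -> np.ndarray:
    """Pairwise Euclidean distances between patches on a grid x grid layout."""
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=-1).astype(np.float64)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)  # shape (N, N), N = grid * grid


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def distance_gated_attention(q, k, v, dist, radius, sharpness):
    """Single-head attention whose weights are gated by patch distance.

    Assumption (not from the source): the gate is
    sigmoid(sharpness * (radius - dist)), so patches within `radius`
    of the query keep most of their content-based weight and farther
    patches are suppressed. The gated weights are renormalized per row.
    """
    d = q.shape[-1]
    content = softmax(q @ k.T / np.sqrt(d))               # content-based weights
    gate = 1.0 / (1.0 + np.exp(-sharpness * (radius - dist)))
    weights = content * gate                               # combine content and position
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Usage: a 4 x 4 patch grid (16 tokens) with 8-dimensional features.
rng = np.random.default_rng(0)
dist = patch_distances(4)
q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
out, weights = distance_gated_attention(q, k, v, dist, radius=1.5, sharpness=4.0)
```

Giving each head its own `radius` would let different heads attend to different rings around the query, which is one loose way to picture a visual field split into regions by distance.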
Pages: 15