PWLT: Pyramid Window-based Lightweight Transformer for image classification

Cited by: 1
Authors
Mo, Yuwei [1, 2]
Zuo, Pengfei [1, 2]
Zhou, Quan [1, 2]
Mo, Zhiyi [2 ]
Fan, Yawen [1 ]
Zhang, Suofei [3 ]
Kang, Bin [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Natl Engn Res Ctr Commun & Networking, Nanjing, Peoples R China
[2] Wuzhou Univ, Univ Key Lab Intelligent Software, Guangxi Coll, Wuzhou, Peoples R China
[3] Nanjing Univ Posts & Telecommun, Dept Internet Things, Nanjing, Peoples R China
Keywords
Image classification; Lightweight vision transformer; Pyramid window; Self-attention
DOI
10.1016/j.compeleceng.2024.109209
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Vision Transformers (ViTs) have recently achieved remarkable progress in image classification. Because the computational cost of the self-attention used in ViTs is quadratic in the number of input tokens, several window-based ViTs have been proposed to reduce it. However, these methods restrict self-attention to spatially constrained local windows and thereby lose the ability to encode image-level global interactions. In addition, a fixed-size window yields only a single-scale representation, which is ill-suited to recognizing objects of varying scales. To address these problems, this paper describes a Pyramid Window-based Lightweight Transformer, namely PWLT, for image classification. Specifically, to capture multi-scale information, we employ windows of different sizes to encode objects of varying scales. To restore the relationships between different windows and explore global context, we introduce a dual self-attention scheme that uses local-to-global attention. Extensive experiments on the ImageNet-1K and CIFAR100 datasets demonstrate the effectiveness of PWLT for image classification.
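The cost argument in the abstract can be illustrated with simple counting. The sketch below is not the PWLT implementation. It only counts how many query-key pairs are scored under global attention versus attention restricted to non-overlapping square windows. The function names, the 56 x 56 token grid, and the window sizes 7, 14, and 28 are all assumptions chosen for illustration. The abstract does not say which sizes PWLT uses or how it combines them.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs scored when every token attends to every token."""
    tokens = height * width
    return tokens * tokens


def windowed_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs scored when attention stays inside non-overlapping
    window x window blocks (assumes the window size divides both sides)."""
    if height % window or width % window:
        raise ValueError("window size must divide height and width")
    num_windows = (height // window) * (width // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window ** 2


if __name__ == "__main__":
    h = w = 56  # hypothetical token grid, not a value from the paper
    print("global:", global_attention_pairs(h, w))
    for win in (7, 14, 28):  # hypothetical pyramid of window sizes
        print(f"window {win}:", windowed_attention_pairs(h, w, win))
```

For a fixed window size, the windowed count equals the number of tokens times the window area, so it grows linearly with the number of tokens rather than quadratically. Larger windows cover more context at a higher cost, which is the trade-off that a multi-scale window design has to balance.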
Pages: 10