ReViT: Vision Transformer Accelerator With Reconfigurable Semantic-Aware Differential Attention

Cited by: 0
Authors
Zou, Xiaofeng [1 ]
Chen, Cen [1 ,2 ]
Shao, Hongen [1 ]
Wang, Qinyu [1 ]
Zhuang, Xiaobin [1 ]
Li, Yangfan [3 ]
Li, Keqin [4 ]
Affiliations
[1] South China Univ Technol, Sch Future Technol, Guangzhou 510641, Peoples R China
[2] Pazhou Lab, Guangzhou 510335, Peoples R China
[3] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Semantics; Transformers; Visualization; Computer vision; Computational modeling; Attention mechanisms; Dogs; Computers; Snow; Performance evaluation; Hardware accelerator; vision transformers; software-hardware co-design; HIERARCHIES;
DOI
10.1109/TC.2024.3504263
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
While vision transformers (ViTs) have continued to achieve new milestones in computer vision, their complicated network architectures with high computation and memory costs have hindered their deployment on resource-limited edge devices. Some customized accelerators have been proposed to accelerate the execution of ViTs, achieving improved performance with reduced energy consumption. However, these approaches use flattened attention mechanisms and ignore the inherent hierarchical visual semantics in images. In this work, we conduct a thorough analysis of hierarchical visual semantics in real-world images, revealing the opportunities and challenges of leveraging visual semantics to accelerate ViTs. We propose ReViT, a systematic algorithm-architecture co-design approach that exploits visual semantics to accelerate ViTs. Our algorithm leverages the strong feature similarity of tokens within the same semantic class to reduce computation and communication through a differential attention mechanism, and supports semantic-aware attention efficiently. A novel dedicated architecture is designed to support the proposed algorithm and translate it into performance improvements. Moreover, we propose an efficient execution dataflow to alleviate workload imbalance and maximize hardware utilization. ReViT opens new directions for accelerating ViTs by exploring the underlying visual semantics of images. On average, ReViT achieves a 2.3x speedup and 3.6x higher energy efficiency over state-of-the-art ViT accelerators.
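The record carries only the abstract, so the following is a minimal sketch in Python of what a differential attention step driven by semantic classes could look like, assuming a representative-plus-residual decomposition per class. The names (semantic_labels, tau) and the residual-pruning scheme are illustrative assumptions, not the authors' actual ReViT algorithm or hardware dataflow.

    # Sketch: semantic-aware differential attention, per the abstract's idea that
    # tokens of the same semantic class share strongly similar features. All
    # identifiers here are hypothetical, not from the paper.
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def semantic_differential_attention(q, k, v, semantic_labels, tau=0.05):
        """Approximate self-attention by computing one full attention-score
        row per semantic class (for a representative token), then correcting
        the remaining tokens with a residual (differential) query whose small
        components are pruned to zero.

        q, k, v: (n_tokens, d) float arrays; semantic_labels: (n_tokens,) ints.
        """
        n, d = q.shape
        scale = 1.0 / np.sqrt(d)
        out = np.empty_like(v)
        for c in np.unique(semantic_labels):
            idx = np.flatnonzero(semantic_labels == c)
            rep = idx[0]                         # representative token of class c
            rep_scores = (q[rep] @ k.T) * scale  # full-cost score row, computed once
            for i in idx:
                dq = q[i] - q[rep]               # residual query; small within a class
                dq = np.where(np.abs(dq) < tau, 0.0, dq)  # prune -> sparse correction
                scores = rep_scores + (dq @ k.T) * scale
                out[i] = softmax(scores) @ v
        return out

    # Toy usage: 8 tokens, 4 feature dims, two semantic classes.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(2, 4))
    labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    x = base[labels] + 0.01 * rng.normal(size=(8, 4))  # near-duplicate features per class
    print(semantic_differential_attention(x, x, x, labels).shape)  # (8, 4)

In this toy decomposition, the potential saving comes from the residual query dq being near zero (and heavily pruned) for tokens that share a semantic class, so the correction term is sparse; how ReViT's reconfigurable architecture and dataflow realize such savings is beyond what the abstract specifies.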
Pages: 1079-1093
Page count: 15
Related Papers
48 records in total
  • [41] An Interpretable Target-Aware Vision Transformer for Polarimetric HRRP Target Recognition with a Novel Attention Loss
    Gao, Fan
    Lang, Ping
    Yeh, Chunmao
    Li, Zhangfeng
    Ren, Dawei
    Yang, Jian
    REMOTE SENSING, 2024, 16 (17)
  • [42] DRViT: A dynamic redundancy-aware vision transformer accelerator via algorithm and architecture co-design on FPGA
    Sun, Xiangfeng
    Zhang, Yuanting
    Wang, Qinyu
    Zou, Xiaofeng
    Liu, Yujia
    Zeng, Ziqian
    Zhuang, Huiping
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2025, 199
  • [43] Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network
    Tang, Linfeng
    Yuan, Jiteng
    Ma, Jiayi
    INFORMATION FUSION, 2022, 82 : 28 - 42
  • [44] MSRA-Net: multi-channel semantic-aware and residual attention mechanism network for unsupervised 3D image registration
    Ren, Xiaozhen
    Song, Haoyuan
    Zhang, Zihao
    Yang, Tiejun
PHYSICS IN MEDICINE AND BIOLOGY, 2024, 69 (16)
  • [45] A 28nm 343.5fps/W Vision Transformer Accelerator with Integer-Only Quantized Attention Block
    Lin, Cheng-Chen
    Lu, Wei
    Huang, Po-Tsang
    Chen, Hung-Ming
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 80 - 84
  • [46] MetaViT: Metabolism-Aware Vision Transformer for Differential Diagnosis of Parkinsonism with 18F-FDG PET
    Zhao, Lin
    Dong, Hexin
    Wu, Ping
    Lu, Jiaying
    Lu, Le
    Zhou, Jingren
    Liu, Tianming
    Zhang, Li
    Zhang, Ling
    Tang, Yuxing
    Zuo, Chuantao
    INFORMATION PROCESSING IN MEDICAL IMAGING, IPMI 2023, 2023, 13939 : 132 - 144
  • [47] CIMFormer: A Systolic CIM-Array-Based Transformer Accelerator With Token-Pruning-Aware Attention Reformulating and Principal Possibility Gathering
    Guo, Ruiqi
    Chen, Xiaofeng
    Wang, Lei
    Wang, Yang
    Sun, Hao
    Wei, Jingchuan
    Han, Huiming
    Liu, Leibo
    Wei, Shaojun
    Hu, Yang
    Yin, Shouyi
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2024, 59 (10) : 3317 - 3329
  • [48] H3DAtten: Heterogeneous 3-D Integrated Hybrid Analog and Digital Compute-in-Memory Accelerator for Vision Transformer Self-Attention
    Li, Wantong
    Manley, Madison
    Read, James
    Kaul, Ankit
    Bakir, Muhannad S.
    Yu, Shimeng
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2023, 31 (10) : 1592 - 1602