ReViT: Vision Transformer Accelerator With Reconfigurable Semantic-Aware Differential Attention

Cited by: 0
|
Authors
Zou, Xiaofeng [1 ]
Chen, Cen [1 ,2 ]
Shao, Hongen [1 ]
Wang, Qinyu [1 ]
Zhuang, Xiaobin [1 ]
Li, Yangfan [3 ]
Li, Keqin [4 ]
Affiliations
[1] South China Univ Technol, Sch Future Technol, Guangzhou 510641, Peoples R China
[2] Pazhou Lab, Guangzhou 510335, Peoples R China
[3] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Semantics; Transformers; Visualization; Computer vision; Computational modeling; Attention mechanisms; Dogs; Computers; Snow; Performance evaluation; Hardware accelerator; vision transformers; software-hardware co-design; HIERARCHIES;
DOI
10.1109/TC.2024.3504263
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
While vision transformers (ViTs) continue to achieve new milestones in computer vision, their complex network architectures, with high computation and memory costs, hinder their deployment on resource-limited edge devices. Several customized accelerators have been proposed to speed up the execution of ViTs, achieving improved performance with reduced energy consumption. However, these approaches rely on flattened attention mechanisms and ignore the inherent hierarchical visual semantics in images. In this work, we conduct a thorough analysis of hierarchical visual semantics in real-world images, revealing the opportunities and challenges of leveraging visual semantics to accelerate ViTs. We propose ReViT, a systematic algorithm and architecture co-design approach that exploits visual semantics to accelerate ViTs. The proposed algorithm leverages the strong feature similarity among tokens of the same semantic class to reduce computation and communication through a differential attention mechanism, and supports semantic-aware attention efficiently. A novel dedicated architecture is designed to support the proposed algorithm and translate it into performance improvements. Moreover, we propose an efficient execution dataflow that alleviates workload imbalance and maximizes hardware utilization. ReViT opens new directions for accelerating ViTs by exploiting the underlying visual semantics of images, achieving an average 2.3x speedup and 3.6x higher energy efficiency over state-of-the-art ViT accelerators.
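The core idea in the abstract, reusing attention results across tokens of the same semantic class, can be illustrated with a minimal sketch. This is an assumption-laden toy, not ReViT's actual differential attention algorithm: the grouping is given externally (the paper derives it from visual semantics), and a single representative token's attention row stands in for its whole group.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_aware_attention(Q, K, V, groups):
    """Toy semantic-aware attention (illustrative only).

    Q, K, V: (n_tokens, dim) arrays.
    groups:  list mapping each token index to a semantic group id.
    Attention is computed once per group, for a representative token,
    and the result is reused for all other members of that group,
    saving (n_tokens - n_groups) attention rows of computation.
    """
    n, d = Q.shape
    out = np.zeros_like(V)
    for g in set(groups):
        members = [i for i in range(n) if groups[i] == g]
        rep = members[0]                    # representative token (assumption)
        scores = Q[rep] @ K.T / np.sqrt(d)  # one attention row, shape (n,)
        attn = softmax(scores)
        out[members] = attn @ V             # broadcast result to the group
    return out
```

With n tokens collapsed into k semantic groups, the sketch computes k attention rows instead of n; the paper's differential mechanism refines this reuse rather than sharing rows verbatim.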
Pages: 1079-1093
Page count: 15
Related Papers
48 records in total
  • [21] WASMaker: Differential Testing of WebAssembly Runtimes via Semantic-Aware Binary Generation
    Cao, Shangtong
    He, Ningyu
    She, Xinyu
    Zhang, Yixuan
    Zhang, Mu
    Wang, Haoyu
    PROCEEDINGS OF THE 33RD ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2024, 2024, : 1262 - 1273
  • [22] Vision-Language Action Knowledge Learning for Semantic-Aware Action Quality Assessment
    Xu, Huangbiao
    Ke, Xiao
    Li, Yuezhou
    Xu, Rui
    Wu, Huanqi
    Lin, Xiaofeng
    Guo, Wenzhong
    COMPUTER VISION - ECCV 2024, PT XLII, 2025, 15100 : 423 - 440
  • [23] Semantic-Aware Vision-Assisted Integrated Sensing and Communication: Architecture and Resource Allocation
    Lu, Yang
    Mao, Weihao
    Du, Hongyang
    Dobre, Octavia A.
    Niyato, Dusit
    Ding, Zhiguo
    IEEE WIRELESS COMMUNICATIONS, 2024, 31 (03) : 302 - 308
  • [24] DeSeal: Semantic-Aware Seal2Clear Attention for Document Seal Removal
    Liu, Yifan
    Huang, Jiancheng
    Chen, Shifeng
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 1702 - 1706
  • [25] SMTDKD: A Semantic-Aware Multimodal Transformer Fusion Decoupled Knowledge Distillation Method for Action Recognition
    Quan, Zhenzhen
    Chen, Qingshan
    Wang, Wei
    Zhang, Moyan
    Li, Xiang
    Li, Yujun
    Liu, Zhi
    IEEE SENSORS JOURNAL, 2024, 24 (02) : 2289 - 2304
  • [26] A Dual Semantic-Aware Recurrent Global-Adaptive Network for Vision-and-Language Navigation
    Wang, Liuyi
    He, Zongtao
    Tang, Jiagui
    Dang, Ronghao
    Wang, Naijia
    Liu, Chengju
    Chen, Qijun
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 1479 - 1487
  • [27] Dual-attention-based semantic-aware self-supervised monocular depth estimation
    Xu, Jinze
    Ye, Feng
    Lai, Yizong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (24) : 65579 - 65601
  • [28] End-to-end semantic-aware object retrieval based on region-wise attention
    Li, Xiu
    Jin, Kun
    Long, Rujiao
    NEUROCOMPUTING, 2019, 359 : 219 - 226
  • [29] ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation
    Xing, Daitao
    Shen, Jinglin
    Ho, Chiuman
    Tzes, Anthony
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 2983 - 2991
  • [30] A Semantic-Aware Attention and Visual Shielding Network for Cloth-Changing Person Re-Identification
    Gao, Zan
    Wei, Hongwei
    Guan, Weili
    Nie, Jie
    Wang, Meng
    Chen, Shengyong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 1243 - 1257