PSQ: An Automatic Search Framework for Data-Free Quantization on PIM-based Architecture

Cited by: 1
Authors
Liu, Fangxin [1, 2]
Yang, Ning [1, 2]
Jiang, Li [1, 2]
Affiliations
[1] Shanghai Jiao Tong University, Shanghai, People's Republic of China
[2] Shanghai Qi Zhi Institute, Shanghai, People's Republic of China
Funding
National Natural Science Foundation of China
DOI
10.1109/ICCD58817.2023.00084
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Crossbar-based Process-In-Memory (PIM) architecture is considered a promising solution for accelerating Deep Neural Networks (DNNs). Because the model size and computational budget of DNNs keep growing, model compression is a critical step in deploying them. However, when DNNs are deployed on PIM architectures, fine-grained quantization of the weight matrices is difficult because of the inflexible data path inside the crossbar. In this paper, we study the feasibility and efficiency of PSQ, a novel fine-grained quantization scheme for PIM-based designs. The scheme tightly couples the quantization search principle with the PIM architecture to provide smooth, hardware-friendly quantization. We leverage weight locality and the variety of weight distributions across blocks to facilitate fine-grained quantization. We also propose a lightweight search framework that adaptively allocates the quantization parameters (e.g., scale and bitwidth). During the search, suitable quantization parameters are assigned directly to each fine-grained block so that the weight distributions before and after quantization stay as close as possible, which minimizes the quantization error. Our evaluation shows that PSQ achieves a 3.5x reduction in occupied crossbars with negligible accuracy loss. Moreover, PSQ completes this process in just a few seconds on a single CPU, without model retraining or expensive computation.
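The per-block parameter search described in the abstract can be illustrated with a minimal sketch. This is not the authors' PSQ implementation: the function names (`quantize_block`, `search_block`), the candidate bitwidths, the clipping-based scale candidates, and the use of mean squared error as a stand-in for "closeness of weight distributions" are all assumptions made for illustration only.

```python
def quantize_block(block, bits, scale):
    """Symmetric uniform quantization of a flat list of weights (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    out = []
    for w in block:
        q = round(w / scale)             # map to the integer grid
        q = max(-qmax, min(qmax, q))     # clip to the representable range
        out.append(q * scale)            # map back to real values
    return out


def mse(a, b):
    """Mean squared error, used here as a simple proxy for distribution mismatch."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_block(block, bit_choices=(2, 3, 4, 6, 8), tol=1e-3, n_scales=20):
    """Return (bits, scale, error) for one block.

    Tries bitwidths from smallest to largest and stops at the first one
    whose best scale keeps the error within `tol` (hypothetical criterion).
    If no bitwidth meets `tol`, the lowest-error candidate seen is returned.
    """
    max_abs = max(abs(w) for w in block) or 1.0
    best = None
    for bits in sorted(bit_choices):
        qmax = 2 ** (bits - 1) - 1
        for i in range(n_scales):
            # Candidate clipping ranges from max_abs down towards 50% of it.
            clip = max_abs * (1.0 - 0.5 * i / n_scales)
            scale = clip / qmax
            err = mse(block, quantize_block(block, bits, scale))
            if best is None or err < best[2]:
                best = (bits, scale, err)
        if best[2] <= tol:
            return best
    return best


def search_model(blocks, **kwargs):
    """Assign quantization parameters independently to every block."""
    return [search_block(b, **kwargs) for b in blocks]
```

Because each block is searched independently and only forward quantization is evaluated, a search of this shape needs no training data or gradient computation, which is consistent with the data-free, CPU-only setting the abstract describes.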
Pages: 507-514
Page count: 8
Related papers
50 in total
  • [21] An Empirical Study of Data-Free Quantization's Tuning Robustness
    Chen, Hong
    Wen, Yuxuan
    Ding, Yifu
    Yang, Zhen
    Guo, Yufei
    Qin, Haotong
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022: 171-177
  • [22] ChaoPIM: A PIM-based Protection Framework for DNN Accelerators Using Chaotic Encryption
    Lin, Ning
    Chen, Xiaoming
    Xia, Chunwei
    Ye, Jing
    Li, Xiaowei
    2021 IEEE 30TH ASIAN TEST SYMPOSIUM (ATS 2021), 2021: 1-6
  • [23] Robustness-Guided Image Synthesis for Data-Free Quantization
    Bai, Jianhong
    Yang, Yuchen
    Chu, Huanpeng
    Wang, Hualiang
    Liu, Zuozhu
    Chen, Ruizhe
    He, Xiaoxuan
    Mu, Lianrui
    Cai, Chengfei
    Hu, Haoji
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024: 10971-10979
  • [24] Supporting Moderate Data Dependency, Position Dependency, and Divergence in PIM-Based Accelerators
    Lenjani, Marzieh
    Skadron, Kevin
    IEEE MICRO, 2022, 42 (01): 108-115
  • [25] GraphP: Reducing Communication for PIM-based Graph Processing with Efficient Data Partition
    Zhang, Mingxing
    Zhuo, Youwei
    Wang, Chao
    Gao, Mingyu
    Wu, Yongwei
    Chen, Kang
    Kozyrakis, Christos
    Qian, Xuehai
    2018 24TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2018: 544-557
  • [26] Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design
    Zhang, Xingyao
    Song, Shuaiwen Leon
    Xie, Chenhao
    Wang, Jing
    Zhang, Weigong
    Fu, Xin
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020: 542-555
  • [27] DAC: Data-free Automatic Acceleration of Convolutional Networks
    Li, Xin
    Zhang, Shuai
    Jiang, Bolan
    Qi, Yingyong
    Chuah, Mooi Choo
    Bi, Ning
    2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019: 1598-1606
  • [28] Learning to Generate Diverse Data From a Temporal Perspective for Data-Free Quantization
    Luo, Hui
    Zhang, Shuhai
    Zhuang, Zhuangwei
    Mai, Jiajie
    Tan, Mingkui
    Zhang, Jianlin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10): 9484-9498
  • [29] Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization
    Gao, Yangcheng
    Zhang, Zhao
    Hong, Richang
    Zhang, Haijun
    Fan, Jicong
    Yan, Shuicheng
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022: 141-150
  • [30] A Data-Free Distillation Framework for Adaptive Bitrate Algorithms
    Huang T.-C.
    Li C.-Y.
    Zhang R.-X.
    Li W.-Z.
    Sun L.-F.
    Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (01): 113-130