PSQ: An Automatic Search Framework for Data-Free Quantization on PIM-based Architecture

Cited by: 1
Authors
Liu, Fangxin [1 ,2 ]
Yang, Ning [1 ,2 ]
Jiang, Li [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/ICCD58817.2023.00084
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Crossbar-based Processing-In-Memory (PIM) architectures have been considered a promising solution for accelerating Deep Neural Networks (DNNs). Due to the ever-increasing model size and computational budget of DNNs, model compression is a critical step in their deployment. However, when deploying DNNs on PIM architectures, fine-grained quantization of DNN weight matrices is difficult because of the inflexible data path inside the crossbar. To this end, in this paper we study the feasibility and efficiency of a novel fine-grained quantization scheme, called PSQ, for PIM-based designs. The scheme tightly couples the search principle of quantization with the PIM architecture to provide smooth, hardware-friendly quantization. We leverage weight locality and the variety of weight distributions across different blocks to facilitate the fine-grained quantization process. Meanwhile, we propose a lightweight search framework that adaptively allocates the quantization parameters (e.g., scale and bitwidth). During the search, suitable quantization parameters are assigned directly to each fine-grained block, keeping the weight distributions before and after quantization as close as possible and thus minimizing quantization error. Our evaluation shows that PSQ achieves a 3.5x reduction in occupied crossbars with negligible accuracy loss. Moreover, PSQ completes this process in just a few seconds on a single CPU, without model retraining or expensive computation.
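The per-block parameter search described in the abstract can be illustrated with a minimal sketch: for each weight block, try candidate bitwidths and scales and keep the pair that minimizes the quantization error. This is an assumption-laden toy (uniform symmetric quantization, L2 error, a small grid of scales), not the authors' PSQ implementation; the function names `quantize` and `search_block_params` are hypothetical.

```python
import numpy as np

def quantize(block, scale, bits):
    # Symmetric uniform quantization of a weight block, then dequantize
    # back to floats so we can measure the reconstruction error.
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(block / scale), -qmax - 1, qmax)
    return q * scale

def search_block_params(block, bit_choices=(2, 4, 8), n_scales=16):
    # Grid-search (scale, bitwidth) per block, minimizing the L2 distance
    # between the original and quantized weights -- a stand-in for keeping
    # the pre- and post-quantization weight distributions close.
    best = None
    max_abs = np.abs(block).max() + 1e-12
    for bits in bit_choices:
        qmax = 2 ** (bits - 1) - 1
        for scale in np.linspace(max_abs / qmax * 0.5, max_abs / qmax, n_scales):
            err = np.linalg.norm(block - quantize(block, scale, bits))
            if best is None or err < best[0]:
                best = (err, scale, bits)
    return best  # (error, scale, bitwidth) chosen for this block
```

Because the search only inspects the weights themselves, it needs no training data or retraining, matching the data-free, seconds-on-a-CPU character claimed in the abstract.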
Pages: 507-514
Number of pages: 8