PSQ: An Automatic Search Framework for Data-Free Quantization on PIM-based Architecture

Cited by: 1
Authors
Liu, Fangxin [1 ,2 ]
Yang, Ning [1 ,2 ]
Jiang, Li [1 ,2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/ICCD58817.2023.00084
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Crossbar-based Processing-In-Memory (PIM) architectures have been considered a promising solution for accelerating Deep Neural Networks (DNNs). Due to the ever-increasing model size and computational budget of DNNs, model compression is a critical step in their deployment. However, when deploying DNNs on PIM architectures, fine-grained quantization of DNN weight matrices is difficult because of the inflexible data path inside the crossbar. To this end, in this paper we study the feasibility and efficiency of a novel fine-grained quantization scheme, called PSQ, for PIM-based designs. The scheme tightly couples the search principle of quantization with the PIM architecture to provide smooth, hardware-friendly quantization. We leverage weight locality and the variety of weight distributions across different blocks to facilitate the fine-grained quantization process. Meanwhile, we propose a lightweight search framework that adaptively allocates the quantization parameters (e.g., scale and bitwidth). During the search, suitable quantization parameters are assigned directly to each fine-grained block, keeping the weight distributions before and after quantization as close as possible and thus minimizing quantization error. Our evaluation shows that PSQ achieves a 3.5x reduction in occupied crossbars with negligible accuracy loss. Moreover, PSQ completes this process in just a few seconds on a single CPU, without model retraining or expensive computation.
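The per-block parameter search described in the abstract can be illustrated with a minimal sketch: for each weight block, try candidate bitwidths and scales and keep the pair that minimizes the quantization error. This is an assumption-laden toy (uniform symmetric quantization, L2 error, a small grid of scales), not the authors' PSQ implementation; the function names `quantize` and `search_block_params` are hypothetical.

```python
import numpy as np

def quantize(block, scale, bits):
    # Symmetric uniform quantization of a weight block, then dequantize
    # back to floats so we can measure the reconstruction error.
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(block / scale), -qmax - 1, qmax)
    return q * scale

def search_block_params(block, bit_choices=(2, 4, 8), n_scales=16):
    # Grid-search (scale, bitwidth) per block, minimizing the L2 distance
    # between the original and quantized weights -- a stand-in for keeping
    # the pre- and post-quantization weight distributions close.
    best = None
    max_abs = np.abs(block).max() + 1e-12
    for bits in bit_choices:
        qmax = 2 ** (bits - 1) - 1
        for scale in np.linspace(max_abs / qmax * 0.5, max_abs / qmax, n_scales):
            err = np.linalg.norm(block - quantize(block, scale, bits))
            if best is None or err < best[0]:
                best = (err, scale, bits)
    return best  # (error, scale, bitwidth) chosen for this block
```

Because the search only inspects the weights themselves, it needs no training data or retraining, matching the data-free, seconds-on-a-CPU character claimed in the abstract.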
Pages: 507-514
Number of pages: 8