An efficient segmented quantization for graph neural networks

Citations: 0
Authors
Yue Dai
Xulong Tang
Youtao Zhang
Affiliation
Department of Computer Science, University of Pittsburgh
Keywords
Graph neural network; Quantization; Accelerator
DOI: not available
Abstract
Graph Neural Networks (GNNs) are recently developed machine learning approaches that exploit advances in neural networks for a wide range of graph applications. While GNNs achieve promising inference accuracy improvements over conventional approaches, their efficiency suffers from expensive computation and intensive memory accesses in the feature aggregation and combination phases, leading to large inference latency. Recent studies proposed mixed-precision feature quantization to address the memory access overhead; however, its linear approximation and computation complexity become the main constraints on overall GNN accuracy and performance. In this paper, we propose segmented quantization, which partitions the feature range into segments, customizes the linear approximation within each segment based on the density of the original values, and performs efficient mixed-precision computation between quantized features and full-precision weights. Segmented quantization achieves high inference accuracy while maintaining low computation complexity. We also devise a hardware accelerator to fully exploit the benefits of segmented quantization. Our experiments show that up to 5% average accuracy improvement and up to 6.8× performance improvement can be achieved over state-of-the-art GNN accelerators.
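The abstract only sketches the technique, so the following Python snippet is a minimal illustration of the general idea of segmented quantization: the feature value range is split into segments whose boundaries follow value density, a separate uniform (linear) quantizer is applied inside each segment, and the quantized features are then multiplied by full-precision weights. The quantile-based boundaries, the fixed per-segment bit-widths, and the helper name segmented_quantize are illustrative assumptions, not the authors' actual design or the accelerator's datapath.

```python
# Illustrative sketch only: the paper's exact segmentation and bit-allocation
# rules are not given in this abstract, so the boundary choice, per-segment
# bit-widths, and function name below are assumptions for demonstration.
import numpy as np

def segmented_quantize(features, num_segments=4, bits_per_segment=(8, 6, 4, 4)):
    """Partition the feature value range into density-based segments and
    apply a separate uniform (linear) quantizer inside each segment."""
    flat = features.ravel()
    # Density-based boundaries via equal-mass quantiles: dense regions of the
    # distribution get narrower segments, hence finer effective resolution.
    edges = np.quantile(flat, np.linspace(0.0, 1.0, num_segments + 1))
    quantized = np.empty_like(flat)
    for s in range(num_segments):
        lo, hi = edges[s], edges[s + 1]
        if s == num_segments - 1:
            mask = (flat >= lo) & (flat <= hi)   # last segment includes the upper edge
        else:
            mask = (flat >= lo) & (flat < hi)
        levels = (1 << bits_per_segment[s]) - 1
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = np.round((flat[mask] - lo) / scale)  # integer codes within the segment
        quantized[mask] = codes * scale + lo         # dequantized approximation
    return quantized.reshape(features.shape)

# Mixed-precision combination: quantized features times full-precision weights.
x = np.random.randn(128, 64).astype(np.float32)
w = np.random.randn(64, 16).astype(np.float32)
y = segmented_quantize(x) @ w
```

Under this sketch, dense regions of the feature distribution fall into narrower segments, so the same number of quantization levels yields a finer step size exactly where most values lie, which is the intuition behind density-based linear approximation.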
Pages: 461 - 473 (12 pages)
Related papers (50 in total)
  • [1] An efficient segmented quantization for graph neural networks
    Dai, Yue
    Tang, Xulong
    Zhang, Youtao
    CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2022, 4 (04) : 461 - 473
  • [2] Quantization in Graph Convolutional Neural Networks
    Ben Saad, Leila
    Beferull-Lozano, Baltasar
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1855 - 1859
  • [3] Flexible Quantization for Efficient Convolutional Neural Networks
    Zacchigna, Federico Giordano
    Lew, Sergio
    Lutenberg, Ariel
    ELECTRONICS, 2024, 13 (10)
  • [4] Bit Efficient Quantization for Deep Neural Networks
    Nayak, Prateeth
    Zhang, David
    Chai, Sek
    FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 52 - 56
  • [5] Efficient Ensembles of Graph Neural Networks
    Nagarajan, Amrit
    Stevens, Jacob R.
    Raghunathan, Anand
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 187 - 192
  • [6] Re-quantization based binary graph neural networks
    Yao, Kai-Lang
    Li, Wu-Jun
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (07) : 160 - 171
  • [7] Space Efficient Quantization for Deep Convolutional Neural Networks
    Zhao, Dong-Di
    Li, Fan
    Sharif, Kashif
    Xia, Guang-Min
    Wang, Yu
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2019, 34 (02) : 305 - 317