An efficient segmented quantization for graph neural networks

Cited by: 0

Authors:

Yue Dai

Xulong Tang

Youtao Zhang

Affiliations:

[1] University of Pittsburgh, Department of Computer Science

Source:

CCF Transactions on High Performance Computing | 2022 / Vol. 4

Keywords:

Graph neural network; Quantization; Accelerator;

DOI:

Not available

Abstract:

Graph Neural Networks (GNNs) are recently developed machine learning approaches that exploit the advances in Neural Networks for a wide range of graph applications. While GNNs achieve promising inference accuracy improvements over conventional approaches, their efficiency suffers from expensive computation and intensive memory access in feature aggregation and combination phases, leading to large inference latency. Recent studies proposed mixed-precision feature quantization to address the memory access overhead. However, its linear approximation and computation complexity become the main constraints for the overall GNN accuracy and performance. In this paper, we propose segmented quantization to partition the feature range into segments and customize linear approximation within each segment based on original value density, and conduct efficient mixed-precision computing between quantized feature and full precision weights. Segmented quantization helps to achieve high inference accuracy while maintaining low computation complexity. We also devise the hardware accelerator to fully explore the benefits of segmented quantization. Our experiments show that up to 5% average accuracy and up to 6.8× performance improvements can be achieved over the state-of-the-art GNN accelerators.
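
The idea of partitioning the feature range and applying a separate linear approximation per segment can be illustrated with a minimal sketch. This is not the paper's algorithm, which this record does not describe in detail. It assumes segment boundaries are placed at empirical quantiles (one simple way to follow value density) and that every segment uses the same number of bits. The names `segment_edges`, `quantize`, and `dequantize` are hypothetical.

```python
from bisect import bisect_right

def segment_edges(values, num_segments):
    """Place boundaries at empirical quantiles (an assumption, not the paper's rule),
    so that densely populated value ranges get narrower segments."""
    s = sorted(values)
    n = len(s)
    picks = {s[round(k * (n - 1) / num_segments)] for k in range(num_segments + 1)}
    edges = sorted(picks)  # duplicates collapse, so there may be fewer segments
    if len(edges) < 2:
        raise ValueError("need at least two distinct values")
    return edges

def quantize(x, edges, bits):
    """Map x to (segment index, integer code) using a uniform grid inside its segment.
    Values outside [edges[0], edges[-1]] are clipped to the nearest end."""
    levels = 2 ** bits
    seg = min(max(bisect_right(edges, x) - 1, 0), len(edges) - 2)
    lo, hi = edges[seg], edges[seg + 1]
    step = (hi - lo) / (levels - 1)
    code = round((x - lo) / step)
    return seg, min(max(code, 0), levels - 1)

def dequantize(seg, code, edges, bits):
    """Recover the approximate value from its segment index and code."""
    levels = 2 ** bits
    lo, hi = edges[seg], edges[seg + 1]
    step = (hi - lo) / (levels - 1)
    return lo + code * step

# Usage: a skewed feature vector with most values near zero.
features = [0.01, 0.02, 0.03, 0.05, 0.08, 0.1, 0.5, 2.0, 9.0]
edges = segment_edges(features, 2)
approx = [dequantize(*quantize(x, edges, 2), edges, 2) for x in features]
```

Within each segment the reconstruction error is at most half of that segment's step, so narrow segments in dense regions give finer resolution where most values lie, at the cost of coarser resolution in sparse regions.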

Pages: 461-473

Page count: 12

