An efficient segmented quantization for graph neural networks

Citations: 0
Authors
Yue Dai
Xulong Tang
Youtao Zhang
Affiliation
Department of Computer Science, University of Pittsburgh
Keywords
Graph neural network; Quantization; Accelerator
DOI: not available
Abstract
Graph Neural Networks (GNNs) are recently developed machine learning approaches that exploit advances in neural networks for a wide range of graph applications. While GNNs achieve promising inference accuracy improvements over conventional approaches, their efficiency suffers from expensive computation and intensive memory accesses in the feature aggregation and combination phases, leading to large inference latency. Recent studies proposed mixed-precision feature quantization to address the memory access overhead; however, its linear approximation and computation complexity become the main constraints on overall GNN accuracy and performance. In this paper, we propose segmented quantization, which partitions the feature range into segments, customizes the linear approximation within each segment based on the density of the original values, and performs efficient mixed-precision computation between quantized features and full-precision weights. Segmented quantization achieves high inference accuracy while maintaining low computation complexity. We also devise a hardware accelerator to fully exploit the benefits of segmented quantization. Our experiments show that up to 5% average accuracy improvement and up to 6.8× performance improvement can be achieved over state-of-the-art GNN accelerators.
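The abstract only sketches the technique, so the following Python snippet is a minimal illustration of the general idea of segmented quantization: the feature value range is split into segments whose boundaries follow value density, a separate uniform (linear) quantizer is applied inside each segment, and the quantized features are then multiplied by full-precision weights. The quantile-based boundaries, the fixed per-segment bit-widths, and the helper name segmented_quantize are illustrative assumptions, not the authors' actual design or the accelerator's datapath.

```python
# Illustrative sketch only: the paper's exact segmentation and bit-allocation
# rules are not given in this abstract, so the boundary choice, per-segment
# bit-widths, and function name below are assumptions for demonstration.
import numpy as np

def segmented_quantize(features, num_segments=4, bits_per_segment=(8, 6, 4, 4)):
    """Partition the feature value range into density-based segments and
    apply a separate uniform (linear) quantizer inside each segment."""
    flat = features.ravel()
    # Density-based boundaries via equal-mass quantiles: dense regions of the
    # distribution get narrower segments, hence finer effective resolution.
    edges = np.quantile(flat, np.linspace(0.0, 1.0, num_segments + 1))
    quantized = np.empty_like(flat)
    for s in range(num_segments):
        lo, hi = edges[s], edges[s + 1]
        if s == num_segments - 1:
            mask = (flat >= lo) & (flat <= hi)   # last segment includes the upper edge
        else:
            mask = (flat >= lo) & (flat < hi)
        levels = (1 << bits_per_segment[s]) - 1
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = np.round((flat[mask] - lo) / scale)  # integer codes within the segment
        quantized[mask] = codes * scale + lo         # dequantized approximation
    return quantized.reshape(features.shape)

# Mixed-precision combination: quantized features times full-precision weights.
x = np.random.randn(128, 64).astype(np.float32)
w = np.random.randn(64, 16).astype(np.float32)
y = segmented_quantize(x) @ w
```

Under this sketch, dense regions of the feature distribution fall into narrower segments, so the same number of quantization levels yields a finer step size exactly where most values lie, which is the intuition behind density-based linear approximation.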
Pages: 461 - 473 (12 pages)
Related papers (50 in total)
  • [1] An efficient segmented quantization for graph neural networks
    Dai, Yue
    Tang, Xulong
    Zhang, Youtao
    CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2022, 4 (04) : 461 - 473
  • [2] Quantization in Graph Convolutional Neural Networks
    Ben Saad, Leila
    Beferull-Lozano, Baltasar
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1855 - 1859
  • [3] Flexible Quantization for Efficient Convolutional Neural Networks
    Zacchigna, Federico Giordano
    Lew, Sergio
    Lutenberg, Ariel
    ELECTRONICS, 2024, 13 (10)
  • [4] Bit Efficient Quantization for Deep Neural Networks
    Nayak, Prateeth
    Zhang, David
    Chai, Sek
    FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 52 - 56
  • [5] Efficient Ensembles of Graph Neural Networks
    Nagarajan, Amrit
    Stevens, Jacob R.
    Raghunathan, Anand
    PROCEEDINGS OF THE 59TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC 2022, 2022, : 187 - 192
  • [6] Re-quantization based binary graph neural networks
    Yao, Kai-Lang
    Li, Wu-Jun
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (07) : 160 - 171
  • [7] Space Efficient Quantization for Deep Convolutional Neural Networks
    Zhao, Dong-Di
    Li, Fan
    Sharif, Kashif
    Xia, Guang-Min
    Wang, Yu
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2019, 34 (02) : 305 - 317