An efficient segmented quantization for graph neural networks

Cited by: 0
Authors
Yue Dai
Xulong Tang
Youtao Zhang
Affiliations
[1] University of Pittsburgh, Department of Computer Science
Keywords
Graph neural network; Quantization; Accelerator
DOI
Not available
Abstract
Graph Neural Networks (GNNs) are recently developed machine learning approaches that exploit advances in neural networks for a wide range of graph applications. While GNNs achieve promising inference accuracy improvements over conventional approaches, their efficiency suffers from expensive computation and intensive memory accesses in the feature aggregation and combination phases, leading to large inference latency. Recent studies proposed mixed-precision feature quantization to address the memory access overhead; however, its linear approximation and computation complexity become the main constraints on overall GNN accuracy and performance. In this paper, we propose segmented quantization, which partitions the feature range into segments, customizes the linear approximation within each segment based on the original value density, and performs efficient mixed-precision computation between quantized features and full-precision weights. Segmented quantization helps achieve high inference accuracy while maintaining low computation complexity. We also devise a hardware accelerator to fully exploit the benefits of segmented quantization. Our experiments show that up to 5% average accuracy improvement and up to 6.8× performance improvement can be achieved over state-of-the-art GNN accelerators.
Pages: 461-473
Number of pages: 12
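
The abstract only sketches segmented quantization at a high level. To make the idea concrete, below is a minimal NumPy illustration under stated assumptions: it splits the observed feature range into segments at value quantiles (a stand-in for the paper's density-based partitioning, whose exact rule is not given in this record), assigns each segment its own low-bit linear scale, and dequantizes per segment before multiplying with full-precision weights, mimicking the mixed-precision combination step. All function names, the quantile heuristic, and the 4-bit/4-segment settings are illustrative choices, not the authors' algorithm or accelerator datapath.

    import numpy as np

    def segment_boundaries(x, num_segments=4):
        # Hypothetical density-aware partitioning: cut the observed feature range
        # at quantiles so that dense value regions receive their own segments.
        # (The paper's actual segmentation criterion is not given in this record.)
        return np.quantile(x, np.linspace(0.0, 1.0, num_segments + 1))

    def segmented_quantize(x, bounds, bits=4):
        # Each value is quantized with the scale of the segment it falls into,
        # i.e., a separate linear approximation per segment.
        levels = 2 ** bits - 1
        seg_ids = np.clip(np.searchsorted(bounds, x, side="right") - 1, 0, len(bounds) - 2)
        lo, hi = bounds[seg_ids], bounds[seg_ids + 1]
        scale = np.maximum(hi - lo, 1e-12) / levels
        codes = np.round((x - lo) / scale).astype(np.uint8)
        return codes, seg_ids

    def segmented_dequantize(codes, seg_ids, bounds, bits=4):
        # Rebuild an approximate full-precision value from the segment-local code.
        levels = 2 ** bits - 1
        lo, hi = bounds[seg_ids], bounds[seg_ids + 1]
        scale = np.maximum(hi - lo, 1e-12) / levels
        return lo + codes.astype(np.float32) * scale

    # Toy combination step: quantized node features times full-precision weights.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(8, 16)).astype(np.float32)  # node feature matrix
    weights = rng.normal(size=(16, 4)).astype(np.float32)   # full-precision layer weights

    bounds = segment_boundaries(features)
    codes, seg_ids = segmented_quantize(features, bounds)
    approx = segmented_dequantize(codes, seg_ids, bounds)

    print("max abs quantization error:", float(np.abs(features - approx).max()))
    print("mixed-precision output shape:", (approx @ weights).shape)

A per-segment scale concentrates the available quantization levels where feature values are dense, which matches the abstract's intuition for recovering the accuracy lost to a single global linear approximation.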