An efficient segmented quantization for graph neural networks

Cited by: 0
Authors
Yue Dai
Xulong Tang
Youtao Zhang
Affiliation
[1] University of Pittsburgh, Department of Computer Science
Keywords
Graph neural network; Quantization; Accelerator
DOI
Not available
Abstract
Graph Neural Networks (GNNs) are recently developed machine learning approaches that exploit advances in neural networks for a wide range of graph applications. While GNNs achieve promising inference accuracy improvements over conventional approaches, their efficiency suffers from expensive computation and intensive memory access in the feature aggregation and combination phases, leading to large inference latency. Recent studies proposed mixed-precision feature quantization to address the memory access overhead. However, its linear approximation and computational complexity become the main constraints on overall GNN accuracy and performance. In this paper, we propose segmented quantization, which partitions the feature range into segments, customizes the linear approximation within each segment based on the density of the original values, and conducts efficient mixed-precision computing between quantized features and full-precision weights. Segmented quantization helps to achieve high inference accuracy while maintaining low computational complexity. We also devise a hardware accelerator to fully exploit the benefits of segmented quantization. Our experiments show that up to 5% average accuracy and up to 6.8× performance improvements can be achieved over state-of-the-art GNN accelerators.
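The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of partitioning a feature range and quantizing linearly within each segment. It is not the authors' method. The function names, the single split at the 90th percentile, the 8 levels per segment, and the synthetic skewed data are all assumptions made for illustration.

```python
def uniform_quantize(values, lo, hi, levels):
    """Linear quantization: snap each value to one of `levels` evenly spaced points in [lo, hi]."""
    if hi == lo or levels < 2:
        return [lo for _ in values]
    step = (hi - lo) / (levels - 1)
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)
        out.append(lo + round((clipped - lo) / step) * step)
    return out


def density_boundaries(values, quantile):
    """Hypothetical density-based split: one boundary placed at the given quantile."""
    s = sorted(values)
    idx = min(int(quantile * len(s)), len(s) - 1)
    return [s[0], s[idx], s[-1]]


def segmented_quantize(values, bounds, levels_per_segment):
    """Quantize each value linearly inside the segment [bounds[k], bounds[k+1]] that contains it."""
    num_segments = len(bounds) - 1
    out = []
    for v in values:
        seg = num_segments - 1
        for k in range(num_segments):
            if v <= bounds[k + 1]:
                seg = k
                break
        out.append(uniform_quantize([v], bounds[seg], bounds[seg + 1], levels_per_segment)[0])
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic skewed features (an assumption): 90% packed into [0, 1), 10% spread over [1, 10].
features = [i / 900 for i in range(900)] + [1 + 9 * j / 99 for j in range(100)]

# Baseline: one linear quantizer with 16 levels over the whole range.
uniform = uniform_quantize(features, min(features), max(features), 16)

# Segmented: 2 segments x 8 levels each, i.e. the same budget of 16 levels.
bounds = density_boundaries(features, 0.9)
segmented = segmented_quantize(features, bounds, 8)

uni_mse = mse(features, uniform)
seg_mse = mse(features, segmented)
print("uniform MSE:  ", uni_mse)
print("segmented MSE:", seg_mse)
```

On this particular data the segmented quantizer spends half of its levels on the narrow dense region, which lowers the overall error relative to the single linear quantizer. The gain depends strongly on where the boundaries fall and how levels are allocated: an equal-count split with equal levels per segment, for example, can do worse than uniform quantization on long-tailed data.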
Pages: 461 - 473
Page count: 12
Related papers
50 in total
  • [21] SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks
    Faraone, Julian
    Fraser, Nicholas
    Blott, Michaela
    Leong, Philip H. W.
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018 : 4300 - 4309
  • [22] Pse: mixed quantization framework of neural networks for efficient deployment
    Yang, Yingqing
    Tian, Guanzhong
    Liu, Mingyuan
    Chen, Yihao
    Chen, Jun
    Liu, Yong
    Pan, Yu
    Ma, Longhua
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2023, 20 (06)
  • [23] Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks
    Shu-Chang Zhou
    Yu-Zhi Wang
    He Wen
    Qin-Yao He
    Yu-Heng Zou
    Journal of Computer Science and Technology, 2017, 32 : 667 - 682
  • [24] Pse: mixed quantization framework of neural networks for efficient deployment
    Yingqing Yang
    Guanzhong Tian
    Mingyuan Liu
    Yihao Chen
    Jun Chen
    Yong Liu
    Yu Pan
    Longhua Ma
    Journal of Real-Time Image Processing, 2023, 20
  • [25] Efficient and Interpretable Robot Manipulation With Graph Neural Networks
    Lin, Yixin
    Wang, Austin S.
    Undersander, Eric
    Rai, Akshara
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02) : 2740 - 2747
  • [26] Efficient Streaming Subgraph Isomorphism with Graph Neural Networks
    Chi Thang Duong
    Trung Dung Hoang
    Yin, Hongzhi
    Weidlich, Matthias
    Quoc Viet Hung Nguyen
    Aberer, Karl
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (05) : 730 - 742
  • [27] Efficient Non-Sampling Graph Neural Networks
    Ji, Jianchao
    Li, Zelong
    Xu, Shuyuan
    Ge, Yingqiang
    Tan, Juntao
    Zhang, Yongfeng
    INFORMATION, 2023, 14 (08)
  • [28] Efficient Training of Graph Neural Networks on Large Graphs
    Shen, Yanyan
    Chen, Lei
    Fang, Jingzhi
    Zhang, Xin
    Gao, Shihong
    Yin, Hongbo
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (12) : 4237 - 4240
  • [29] Accurate, efficient and scalable training of Graph Neural Networks
    Zeng, Hanqing
    Zhou, Hongkuan
    Srivastava, Ajitesh
    Kannan, Rajgopal
    Prasanna, Viktor
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2021, 147 : 166 - 183
  • [30] GraphPI: Efficient Protein Inference with Graph Neural Networks
    Ma, Zheng
    Chen, Jiazhen
    Xin, Lei
    Ghodsi, Ali
    JOURNAL OF PROTEOME RESEARCH, 2024, 23 (11) : 4821 - 4834