Quantized Training of Gradient Boosting Decision Trees

Cited by: 0
Authors
Shi, Yu [1 ]
Ke, Guolin [2 ]
Chen, Zhuoming [3 ]
Zheng, Shuxin [1 ]
Liu, Tie-Yan [1 ]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] DP Technol, Beijing, Peoples R China
[3] Tsinghua Univ, Beijing, Peoples R China
Keywords
DOI
None
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent years have witnessed significant success of Gradient Boosting Decision Trees (GBDT) in a wide range of machine learning applications. A long-standing consensus in GBDT training is that gradients and statistics are computed with high-precision floating-point numbers. In this paper, we investigate an important question that has been largely ignored by the previous literature: how many bits are needed to represent gradients when training GBDT? To answer this question, we quantize all the high-precision gradients in a simple yet effective way within the GBDT training algorithm. Surprisingly, both our theoretical analysis and empirical studies show that gradients can be represented with very low precision, e.g., 2 or 3 bits, without hurting performance. With low-precision gradients, most arithmetic operations in GBDT training can be replaced by integer operations of 8, 16, or 32 bits. These findings may pave the way for much more efficient GBDT training in several respects: (1) speeding up the computation of gradient statistics in histograms; (2) compressing the communication cost of high-precision statistical information during distributed training; (3) inspiring the use and development of hardware architectures that support low-precision computation well. Benchmarked on CPUs, GPUs, and distributed clusters, we observe up to a 2x speedup of our simple quantization strategy compared with SOTA GBDT systems on extensive datasets, demonstrating the effectiveness and potential of low-precision GBDT training. The code will be released to the official repository of LightGBM.
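The core idea the abstract describes — map each high-precision gradient to a signed low-bit integer and then accumulate per-bin histogram statistics with integer arithmetic — can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the paper's actual LightGBM implementation: the function names, the per-iteration scaling scheme, and the use of stochastic (unbiased) rounding are assumptions made here for clarity.

```python
import numpy as np

def quantize_gradients(grads, bits=3, rng=None):
    """Stochastically round float gradients to signed low-bit integers.

    Illustrative sketch only (not LightGBM's code): gradients are scaled
    into the representable signed integer range for the given bit width,
    then rounded up or down with probability equal to the fractional
    part, so the quantization is unbiased in expectation.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    max_int = 2 ** (bits - 1) - 1           # e.g. 3 for 3-bit signed
    g_max = np.abs(grads).max()
    scale = (g_max / max_int) if g_max > 0 else 1.0
    scaled = grads / scale
    low = np.floor(scaled)
    # round up with probability equal to the fractional part (unbiased)
    q = low + (rng.random(scaled.shape) < (scaled - low))
    return q.astype(np.int8), scale

def histogram_sums(bin_ids, q_grads, n_bins):
    """Per-bin gradient sums using pure integer (int32) accumulation,
    replacing the floating-point additions of standard GBDT histograms."""
    hist = np.zeros(n_bins, dtype=np.int32)
    np.add.at(hist, bin_ids, q_grads.astype(np.int32))
    return hist
```

Because the low-bit integers fit in 8 bits and the per-bin sums in 32 bits, the histogram construction that dominates GBDT training cost can run entirely in integer arithmetic; multiplying a bin's sum by `scale` recovers an unbiased estimate of the true gradient sum.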
Pages: 12
Related papers
50 records total
  • [31] Classification of Pesticide Residues in Sorghum Based on Hyperspectral and Gradient Boosting Decision Trees
    Hu, Xinjun
    Zhang, Jiahong
    Lei, Yu
    Tian, Jianping
    Peng, Jianheng
    Chen, Manjiao
    JOURNAL OF FOOD SAFETY, 2024, 44 (05)
  • [32] Static PE Malware Detection Using Gradient Boosting Decision Trees Algorithm
    Huu-Danh Pham
    Tuan Dinh Le
    Thanh Nguyen Vu
    FUTURE DATA AND SECURITY ENGINEERING, FDSE 2018, 2018, 11251 : 228 - 236
  • [33] Efficiency of Gradient Boosting Decision Trees Technique in Polish Companies' Bankruptcy Prediction
    Wyrobek, Joanna
    Kluza, Krzysztof
    INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2018, PT III, 2019, 854 : 24 - 35
  • [34] Credit scoring based on tree-enhanced gradient boosting decision trees
    Liu, Wanan
    Fan, Hong
    Xia, Meng
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 189
  • [35] Finding Influential Training Samples for Gradient Boosted Decision Trees
    Sharchilev, Boris
    Ustinovsky, Yury
    Serdyukov, Pavel
    de Rijke, Maarten
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [36] Interactive 3D Vase Design Based on Gradient Boosting Decision Trees
    Wang, Dongming
    Xu, Xing
    Xia, Xuewen
    Jia, Heming
    ALGORITHMS, 2024, 17 (09)
  • [37] Root Cause Identification for Road Network Congestion Using the Gradient Boosting Decision Trees
    Chen, Yue
    Li, Changle
    Yue, Wenwei
    Zhang, Hehe
    Mao, Guoqiang
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
  • [38] Acute Kidney Injury Prediction with Gradient Boosting Decision Trees enriched with Temporal Features
    Golovco, Stela
    Mantovani, Matteo
    Combi, Carlo
    Holmes, John H.
    2022 IEEE 10TH INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2022), 2022, : 669 - 676
  • [39] Computational Prediction of Critical Temperatures of Superconductors Based on Convolutional Gradient Boosting Decision Trees
    Dan, Yabo
    Dong, Rongzhi
    Cao, Zhuo
    Li, Xiang
    Niu, Chengcheng
    Li, Shaobo
    Hu, Jianjun
    IEEE ACCESS, 2020, 8 (08) : 57868 - 57878
  • [40] Malware Detection Using Gradient Boosting Decision Trees with Customized Log Loss Function
    Gao, Yun
    Hasegawa, Hirokazu
    Yamaguchi, Yukiko
    Shimada, Hajime
    35TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING (ICOIN 2021), 2021, : 273 - 278