Accelerating a random forest classifier: multi-core, GP-GPU, or FPGA?

Cited by: 102
Authors
Van Essen, Brian [1 ]
Macaraeg, Chris [1 ]
Gokhale, Maya [1 ]
Prenger, Ryan [1 ]
Affiliation
[1] Lawrence Livermore Natl Lab, Livermore, CA 94550 USA
Keywords
FPGA; GP-GPU; OpenMP; Machine learning
DOI
10.1109/FCCM.2012.47
Chinese Library Classification
TP3 [Computing technology and computer technology]
Discipline Classification Code
0812
Abstract
Random forest classification is a well-known machine learning technique that generates classifiers in the form of an ensemble ("forest") of decision trees. The classification of an input sample is determined by a majority vote of the ensemble. Traditional random forest classifiers can be highly effective, but classification with a random forest is memory bound and typically ill-suited to acceleration on FPGAs or GP-GPUs because it requires traversing large, possibly irregular decision trees. Recent work at Lawrence Livermore National Laboratory has developed several variants of random forest classifiers, including the Compact Random Forest (CRF), that generate decision trees more amenable to acceleration than traditional ones. Our paper compares and contrasts the effectiveness of FPGAs, GP-GPUs, and multi-core CPUs for accelerating classification with models generated by compact random forest machine learning classifiers. Taking advantage of training algorithms that can produce compact random forests composed of many small trees rather than fewer deep trees, we regularize the forest so that classifying any sample takes a deterministic amount of time. This regularization in turn allows us to execute the classifier in a pipelined or single-instruction, multiple-thread (SIMT) fashion. We show that FPGAs provide the highest-performance solution but require a multi-chip/multi-board system to execute even modest-sized forests. GP-GPUs offer a more flexible solution with reasonably high performance that scales with forest size. Finally, multi-threading via OpenMP on a shared-memory system was the simplest solution and provided near-linear scaling with core count, but remained significantly slower than the GP-GPU and FPGA.
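The key enabler described in the abstract is regularizing every tree to a fixed depth, so that classifying any sample performs exactly the same number of comparisons. Below is a minimal C++/OpenMP sketch of that idea, assuming a hypothetical flat, complete-binary-tree layout; the Node/Tree structs, DEPTH constant, and classify function are illustrative names, not the authors' implementation. Each traversal takes exactly DEPTH steps, and the OpenMP loop over samples mirrors the shared-memory multi-core baseline the abstract mentions.

// Sketch: fixed-depth ("compact") random forest classification.
// Each tree is a complete binary tree of depth DEPTH stored as a flat
// array: node i has children 2i+1 and 2i+2, so every traversal is
// exactly DEPTH steps -- the deterministic-time property above.
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int DEPTH  = 6;                  // fixed depth of every tree
constexpr int NODES  = (1 << DEPTH) - 1;   // internal nodes per tree
constexpr int LEAVES = 1 << DEPTH;         // leaves per tree

struct Node { std::uint16_t feature; float threshold; };

struct Tree {
    Node         nodes[NODES];    // complete binary tree, level order
    std::uint8_t labels[LEAVES];  // class label stored at each leaf
};

// Majority vote over the ensemble; OpenMP parallelizes over samples.
void classify(const std::vector<Tree>& forest,
              const float* samples,  // n_samples x n_features, row major
              int n_samples, int n_features,
              int n_classes, std::uint8_t* out)
{
    #pragma omp parallel for
    for (int s = 0; s < n_samples; ++s) {
        const float* x = samples + (std::size_t)s * n_features;
        std::vector<int> votes(n_classes, 0);
        for (const Tree& t : forest) {
            int i = 0;
            for (int d = 0; d < DEPTH; ++d)  // fixed trip count
                i = 2 * i + 1 + (x[t.nodes[i].feature] > t.nodes[i].threshold);
            ++votes[t.labels[i - NODES]];    // i - NODES is the leaf index
        }
        int best = 0;                        // majority vote of the forest
        for (int c = 1; c < n_classes; ++c)
            if (votes[c] > votes[best]) best = c;
        out[s] = (std::uint8_t)best;
    }
}

Because the inner traversal loop has a fixed trip count, the same loop maps naturally onto a pipelined FPGA datapath or a divergence-free SIMT GPU kernel, which is what lets the paper compare all three platforms on equal footing.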
Pages: 232-239
Page count: 8