FDPBoost: Federated differential privacy gradient boosting decision trees

Cited by: 3
Authors
Li, Yingjie [1]
Feng, Yan [1]
Qian, Quan [1, 2, 3, 4, 5]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Mat Genome Inst, Ctr Mat Informat & Data Sci, Shanghai 200444, Peoples R China
[3] Shanghai Univ, Key Lab Silicate Cultural Rel Conservat, Minist Educ, Shanghai, Peoples R China
[4] Zhejiang Lab, Hangzhou 311100, Zhejiang, Peoples R China
[5] Shanghai Frontier Sci Ctr Mechanoinformat, Shanghai 200444, Peoples R China
Keywords
Federated learning; Differential privacy; Gradient boosting decision tree; Distributed two-level boosting framework
DOI
10.1016/j.jisa.2023.103468
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The big data era has led to an exponential increase in data usage, resulting in significant advancements in data-driven domains and data mining. However, due to privacy and regulatory requirements, sharing data among various institutions is not always possible. Federated learning can help address this problem, but existing studies that combine differential privacy with tree models have shown significant accuracy loss. In this study, we propose a Federated Differential Privacy Gradient Boosting Decision Tree (FDPBoost) that protects the private datasets of different owners while improving model accuracy. We select sensitive features according to the secure feature set indicator, use an exponential mechanism to protect sensitive features, and assign significant weight to the Laplace mechanism to protect leaf node values. Additionally, a distributed two-level boosting framework is designed to allocate the privacy budget between intra-iteration and inter-iteration decision trees while protecting model communication. FDPBoost is tested on five datasets sourced from the materials and medical domains. Our experiments reveal that FDPBoost achieves accuracy competitive with traditional federated gradient boosting decision trees while also exhibiting a significant reduction in error rate compared to PPGBDT (Zhao et al.) and FV-tree (Gao et al.). Notably, FDPBoost's error rate on the tumor-diagnosis dataset is 30% lower than that of FV-tree.
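The abstract names two standard differential-privacy primitives: an exponential mechanism for choosing among candidates and a Laplace mechanism for perturbing leaf node values. The following is a minimal sketch of the textbook forms of both, not the FDPBoost implementation. The function names, feature names, gain values, sensitivities, and epsilon values are all illustrative assumptions, and how FDPBoost actually scores features or splits its privacy budget is not stated in this record.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value plus Laplace noise with scale sensitivity / epsilon."""
    return value + laplace_noise(sensitivity / epsilon, rng)


def exponential_mechanism_probs(utilities, sensitivity, epsilon):
    """Selection probabilities proportional to exp(eps * u / (2 * sensitivity))."""
    scores = [epsilon * u / (2.0 * sensitivity) for u in utilities]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, utilities, sensitivity, epsilon, rng):
    """Sample one candidate according to the exponential mechanism."""
    probs = exponential_mechanism_probs(utilities, sensitivity, epsilon)
    return rng.choices(candidates, weights=probs, k=1)[0]


rng = random.Random(0)

# Hypothetical candidate features and split gains (assumed values).
features = ["age", "dose", "hardness"]
gains = [0.2, 1.5, 0.9]
chosen = exponential_mechanism(features, gains, sensitivity=1.0, epsilon=1.0, rng=rng)

# Hypothetical leaf value, sensitivity, and per-leaf budget (assumed values).
noisy_leaf = laplace_mechanism(0.35, sensitivity=0.1, epsilon=0.5, rng=rng)
```

In this sketch a larger epsilon concentrates the selection probability on the highest-gain candidate and shrinks the Laplace noise, which is the usual trade-off between privacy and accuracy that a budget allocation scheme has to balance.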
Pages: 11
Related papers
50 items in total
  • [31] Establishment of a differential diagnosis method and an online prediction platform for AOSD and sepsis based on gradient boosting decision trees algorithm
    Zhou, Dongmei
    Xie, Jingzhi
    Wang, Jiarui
    Zong, Juan
    Fang, Quanquan
    Luo, Fei
    Zhang, Ting
    Ma, Hua
    Cao, Lina
    Yin, Hanqiu
    Yin, Songlou
    Li, Shuyan
    ARTHRITIS RESEARCH & THERAPY, 2023, 25 (01)
  • [32] Boosting Privately: Federated Extreme Gradient Boosting for Mobile Crowdsensing
    Liu, Yang
    Ma, Zhuo
    Liu, Ximeng
    Ma, Siqi
    Nepal, Surya
    Deng, Robert H.
    Ren, Kui
    2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2020, : 1 - 11
  • [33] Effects of Driving Behavior on Fuel Consumption with Explainable Gradient Boosting Decision Trees
    Konstantinou, Christos
    Fafoutellis, Panagiotis
    Mantouka, Eleni G.
    Chalkiadakis, Charis
    Fortsakis, Petro S.
    Vlahogianni, Eleni I.
    2023 8TH INTERNATIONAL CONFERENCE ON MODELS AND TECHNOLOGIES FOR INTELLIGENT TRANSPORTATION SYSTEMS, MT-ITS, 2023,
  • [34] Prediction of Mean Wave Overtopping Discharge Using Gradient Boosting Decision Trees
    den Bieman, Joost P.
    Wilms, Josefine M.
    van den Boogaard, Henk F. P.
    van Gent, Marcel R. A.
    WATER, 2020, 12 (06)
  • [35] Gradient boosting decision trees to study laboratory and field performance in pavement management
    Berangi, Mohammadjavad
    Lontra, Bernardo Mota
    Anupam, Kumar
    Erkens, Sandra
    Van Vliet, Dave
    Snippe, Almar
    Moenielal, Mahesh
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2025, 40 (01) : 3 - 32
  • [36] Ensembling Learning Based Melanoma Classification Using Gradient Boosting Decision Trees
    Han, Yipeng
    Zheng, Xiaolu
    AIPR 2020: 2020 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION, 2020, : 104 - 109
  • [37] A mobile recommendation system based on Logistic Regression and Gradient Boosting Decision Trees
    Wang, Yaozheng
    Feng, Dawei
    Ii, Dongsheng
    Chen, Xinyuan
    Zhac, Yunxiang
    Niu, Xin
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1896 - 1902
  • [38] Explainable Steel Quality Prediction System Based on Gradient Boosting Decision Trees
    Takalo-Mattila, Janne
    Heiskanen, Mikko
    Kyllonen, Vesa
    Maatta, Leena
    Bogdanoff, Agne
    IEEE ACCESS, 2022, 10 : 68099 - 68110
  • [39] Retrieval-Based Gradient Boosting Decision Trees for Disease Risk Assessment
    Ma, Handong
    Cao, Jiahang
    Fang, Yuchen
    Zhang, Weinan
    Sheng, Wenbo
    Zhang, Shaodian
    Yu, Yong
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 3468 - 3476
  • [40] Fast Gradient Boosting Decision Trees with Bit-Level Data Structures
    Devos, Laurens
    Meert, Wannes
    Davis, Jesse
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT I, 2020, 11906 : 590 - 606