Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Cited by: 2
Authors
Wu, Xidong [1]
Hu, Zhengmian [1]
Pei, Jian [2]
Huang, Heng [3]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Keywords
AUPRC; federated learning; imbalanced data; stochastic optimization; serverless federated learning
DOI
10.1145/3580305.3599499
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
To address big data challenges, serverless multi-party collaborative training has recently attracted attention in the data mining community, since it can cut communication cost by avoiding the server node bottleneck. However, traditional serverless multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (e.g., via cross-entropy). The data distribution in many real-world applications is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data tasks, since models can be significantly biased toward the primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although multiple single-machine methods have been designed to train models for AUPRC maximization, no algorithm for multi-party collaborative training has been studied. The change from the single-machine to the multi-party setting poses critical challenges. For example, existing single-machine AUPRC maximization algorithms maintain an inner state for each local data point, so these methods are not applicable to large-scale multi-party collaborative training. To address this challenge, we reformulate the serverless multi-party collaborative AUPRC maximization problem as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. We then apply a variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method.
To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem. Finally, extensive experiments show the advantages of directly optimizing the AUPRC with distributed learning methods and also verify the efficiency of our new algorithms (i.e., SLATE and SLATE-M).
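The abstract's motivation, that accuracy can hide poor performance on imbalanced data while AUPRC exposes it, can be illustrated with a small sketch. This is not the SLATE or SLATE-M algorithm; it only evaluates average precision, a common finite-sample estimator of AUPRC, on hypothetical toy scores. The function names, the 0.5 threshold, and the data are all assumptions made for illustration.

```python
def average_precision(labels, scores):
    """Average precision, a common estimator of AUPRC (assumes distinct scores)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct predictions at a fixed threshold (hypothetical 0.5)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced toy data: 1 positive among 10 examples.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# Model A ranks the positive first; model B ranks it last.
# Both keep every score below 0.5, so both predict "negative" everywhere.
scores_a = [0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05, 0.01]
scores_b = list(reversed(scores_a))

print("accuracy A:", accuracy(labels, scores_a))
print("accuracy B:", accuracy(labels, scores_b))
print("AP A:", average_precision(labels, scores_a))
print("AP B:", average_precision(labels, scores_b))
```

Both toy models reach the same accuracy because they predict the primary class for every example, yet their average precision differs sharply, which is the kind of gap that motivates optimizing AUPRC directly.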
Pages: 2648-2659
Page count: 12
Related papers
50 in total
  • [1] Multi-party collaborative drug discovery via federated learning
    Huang D.
    Ye X.
    Sakurai T.
    Computers in Biology and Medicine, 2024, 171
  • [2] Privacy sensitive distributed data mining from multi-party data
    Kargupta, H
    Liu, K
    Ryan, J
    INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2003, 2665 : 336 - 342
  • [3] Secure Multi-party Protocols for Privacy Preserving Data Mining
    Ma, Qingkai
    Deng, Ping
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, PROCEEDINGS, 2008, 5258 : 526 - 537
  • [4] Secure multi-party communication in data-mining applications
    MITS, Gwalior, India
    Unknown
    Unknown
    Int. J. Database Theory Appl., 4 (299-306):
  • [5] Efficient multi-party privacy preserving data mining for vertically partitioned data
    Sharma, Surbhi
    Shukla, Deepak
    2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 2, 2016, : 189 - 195
  • [6] GraphFederator: Federated Visual Analysis for Multi-party Graphs
    Han, Dongming
    Zhu, Haiyang
    Chen, Wei
    Pan, Rusheng
    Liu, Yijing
    Zhou, Jiehui
    Feng, Haozhe
    Zhang, Tianye
    Wang, Xumeng
    Zhu, Minfeng
    Tao, Jianrong
    Fan, Changjie
    Zhang, Xiaolong
    2024 IEEE 17TH PACIFIC VISUALIZATION CONFERENCE, PACIFICVIS, 2024, : 172 - 181
  • [7] Partially Encrypted Multi-Party Computation for Federated Learning
    Sotthiwat, Ekanut
    Zhen, Liangli
    Li, Zengxiang
    Zhang, Chi
    21ST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2021), 2021, : 828 - 835
  • [8] Secure Federated Learning for Multi-Party Network Monitoring
    Lytvyn, Oleksandr
    Nguyen, Giang
    IEEE ACCESS, 2024, 12 : 163262 - 163284
  • [9] A blockchain-based collaborative training method for multi-party data sharing
    Yin, Lihua
    Feng, Jiyuan
    Lin, Sixin
    Cao, Zhiqiang
    Sun, Zhe
    COMPUTER COMMUNICATIONS, 2021, 173 : 70 - 78
  • [10] High-performance secure multi-party computation for data mining applications
    Bogdanov, Dan
    Niitsoo, Margus
    Toft, Tomas
    Willemson, Jan
    INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2012, 11 (06) : 403 - 418