Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Cited by: 2
Authors
Wu, Xidong [1]
Hu, Zhengmian [1]
Pei, Jian [2]
Huang, Heng [3]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Keywords
AUPRC; federated learning; imbalanced data; stochastic optimization; serverless federated learning
DOI
10.1145/3580305.3599499
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To address big data challenges, serverless multi-party collaborative training has recently attracted attention in the data mining community, since it can cut communication costs by avoiding the server node bottleneck. However, traditional serverless multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (e.g., via the cross-entropy loss). The data distribution in many real-world applications is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data tasks because the models can be significantly biased toward the primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although multiple single-machine methods have been designed to train models for AUPRC maximization, algorithms for multi-party collaborative training have not been studied. The change from the single-machine to the multi-party setting poses critical challenges. For example, existing single-machine AUPRC maximization algorithms maintain an inner state for each local data point, so these methods are not applicable to large-scale multi-party collaborative training. To address this challenge, we reformulate the serverless multi-party collaborative AUPRC maximization problem as a conditional stochastic optimization problem and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. We then apply a variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method.
To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem. Finally, extensive experiments show the advantages of directly optimizing the AUPRC with distributed learning methods and also verify the efficiency of our new algorithms (i.e., SLATE and SLATE-M).
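The abstract's contrast between accuracy and AUPRC on skewed data can be illustrated with a small sketch. It uses non-interpolated average precision, a commonly used finite-sample estimator of AUPRC; the function names and the toy labels and scores are assumptions made for illustration and are not taken from the paper. This is not the SLATE or SLATE-M algorithm, only the metric being discussed.

```python
def average_precision(labels, scores):
    """Non-interpolated average precision: the mean of precision@k
    taken at the rank k of each positive example."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` examples
    return total / n_pos


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced toy data: 2 positives among 10 examples.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.40, 0.45, 0.10, 0.12, 0.15, 0.20, 0.05, 0.08, 0.30, 0.25]

ap = average_precision(labels, scores)
acc = accuracy(labels, scores)
print(f"accuracy = {acc:.2f}, average precision = {ap:.2f}")
```

In this toy case every score falls below the 0.5 threshold, so the classifier predicts the negative class for all examples and still reaches 80% accuracy, while the average precision of 0.5 exposes that the positives are ranked only second and fourth.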
Pages: 2648-2659
Number of pages: 12
Related papers
50 in total
  • [41] A hybrid multicast connectivity solution for multi-party collaborative environments
    Namgon Kim
    JongWon Kim
    Thomas D. Uram
    Multimedia Tools and Applications, 2009, 44 : 17 - 37
  • [42] Multi-party, privacy-preserving distributed data mining using a game theoretic framework
    Kargupta, Hillol
    Das, Kamalika
    Liu, Kun
    KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2007, PROCEEDINGS, 2007, 4702 : 523 - +
  • [43] FedCo: A Federated Learning Controller for Content Management in Multi-party Edge Systems
    Balasubramanian, Venkatraman
    Aloqaily, Moayad
    Reisslein, Martin
    30TH INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS AND NETWORKS (ICCCN 2021), 2021,
  • [44] VFL-R: a novel framework for multi-party in vertical federated learning
    Li, Jialin
    Yan, Tongjiang
    Ren, Pengcheng
    APPLIED INTELLIGENCE, 2023, 53 (10) : 12399 - 12415
  • [45] Multi-party Diabetes Mellitus risk prediction based on secure federated learning
    Su, Yifei
    Huang, Chengwei
    Zhu, Wenwei
    Lyu, Xin
    Ji, Fang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2023, 85
  • [46] VFL-R: a novel framework for multi-party in vertical federated learning
    Jialin Li
    Tongjiang Yan
    Pengcheng Ren
    Applied Intelligence, 2023, 53 : 12399 - 12415
  • [47] Secure Multi-party Data Communications in Cloud Augmented IoT Environment
    Huang, Xueqing
    Ansari, Nirwan
    2017 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2017,
  • [48] Secure Multi-Party Computation Framework in Decentralized Federated Learning for Histopathology Images
    Hosseini, Seyedeh Maryam
    Babaie, Morteza
    Tizhoosh, Hamid
    LABORATORY INVESTIGATION, 2023, 103 (03) : S1293 - S1294
  • [49] Cluster Based Secure Multi-party Computation in Federated Learning for Histopathology Images
    Hosseini, Seyedeh Maryam
    Sikaroudi, Milad
    Babaei, Morteza
    Tizhoosh, Hamid R.
    DISTRIBUTED, COLLABORATIVE, AND FEDERATED LEARNING, AND AFFORDABLE AI AND HEALTHCARE FOR RESOURCE DIVERSE GLOBAL HEALTH, DECAF 2022, FAIR 2022, 2022, 13573 : 110 - 118
  • [50] A Local Distributed Peer-to-Peer Algorithm Using Multi-Party Optimization Based Privacy Preservation for Data Mining Primitive Computation
    Das, Kamalika
    Kargupta, Hillol
    Bhaduri, Kanishka
    2009 IEEE NINTH INTERNATIONAL CONFERENCE ON PEER-TO-PEER COMPUTING (P2P 2009), 2009, : 212 - +