Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Cited by: 2
Authors
Wu, Xidong [1]
Hu, Zhengmian [1]
Pei, Jian [2]
Huang, Heng [3]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Keywords
AUPRC; federated learning; imbalanced data; stochastic optimization; serverless federated learning
DOI
10.1145/3580305.3599499
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To address big data challenges, serverless multi-party collaborative training has recently attracted attention in the data mining community, since it can cut communication costs by avoiding the server node bottleneck. However, traditional serverless multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (e.g., via cross-entropy). The data distribution in many real-world applications is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data tasks because the models can be significantly biased toward the primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although multiple single-machine methods have been designed to train models for AUPRC maximization, no algorithm for multi-party collaborative training has been studied. The change from the single-machine to the multi-party setting poses critical challenges. For example, existing single-machine AUPRC maximization algorithms maintain an inner state for each local data point, so these methods are not applicable to large-scale multi-party collaborative training because of their dependence on each local data point. To address this challenge, we reformulate the serverless multi-party collaborative AUPRC maximization problem as a conditional stochastic optimization problem and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. We then apply the variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method.
To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem. Finally, extensive experiments show the advantages of directly optimizing the AUPRC with distributed learning methods and also verify the efficiency of our new algorithms (i.e., SLATE and SLATE-M).
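The abstract's point that accuracy can mislead on imbalanced data can be illustrated with a minimal sketch. It uses average precision, a common non-differentiable estimator of AUPRC, and is not the SLATE or SLATE-M algorithm. The toy labels, the scores, and the function names `average_precision` and `accuracy` are hypothetical and chosen only for illustration.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k taken at the rank k of each positive."""
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k  # precision among the top-k ranked examples
    return total / hits if hits else 0.0


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 2 positives among 100 examples.
labels = [1, 1] + [0] * 98

# Model A predicts "negative" for everything and ranks the positives last.
scores_a = [0.05, 0.05] + [0.10] * 98

# Model B ranks both positives first but raises 10 false alarms.
scores_b = [0.90, 0.80] + [0.60] * 10 + [0.10] * 88

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, accuracy(labels, scores), round(average_precision(labels, scores), 4))
```

In this toy setting Model A has the higher accuracy even though it never finds a positive, while Model B has the higher average precision, which is the behavior that motivates optimizing AUPRC directly.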
Pages: 2648-2659
Page count: 12
Related papers
50 in total
  • [31] Extended multicast connectivity, solution for multi-party collaborative environments
    Kim, Namgon
    Kim, JongWon
    2007 4TH IEEE CONSUMER COMMUNICATIONS AND NETWORKING CONFERENCE, VOLS 1-3, 2007, : 676 - 680
  • [32] A federated learning system with data fusion for healthcare using multi-party computation and additive secret sharing
    Muazu, Tasiu
    Yingchi, Mao
    Muhammad, Abdullahi Uwaisu
    Ibrahim, Muhammad
    Kumshe, Umar Muhammad Mustapha
    Samuel, Omaji
    COMPUTER COMMUNICATIONS, 2024, 216 : 168 - 182
  • [33] Towards Collaborative Query Planning in Multi-party Database Networks
    Zhao, Mingyi
    Liu, Peng
    Lobo, Jorge
    DATA AND APPLICATIONS SECURITY AND PRIVACY XXIX, 2015, 9149 : 19 - 34
  • [34] Learning By Collaborative Teaching: An Engaging Multi-Party CoWriter Activity
    El Hamamsy, Laila
    Johal, Wafa
    Asselborn, Thibault
    Nasir, Jauwairia
    Dillenbourg, Pierre
    2019 28TH IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2019,
  • [35] Data Federation System for Multi-party Security
    Li S.-Y.
    Ji Y.-D.
    Shi D.-Y.
    Liao W.-D.
    Zhang L.-P.
    Tong Y.-X.
    Xu K.
    Ruan Jian Xue Bao/Journal of Software, 2022, 33 (03): : 1111 - 1127
  • [36] Data Anonymity in Multi-Party Service Model
    Kiyomoto, Shinsaku
    Fukushima, Kazuhide
    Miyake, Yutaka
    SECURITY TECHNOLOGY, 2011, 259 : 21 - 30
  • [37] A hybrid multicast connectivity solution for multi-party collaborative environments
    Kim, Namgon
    Kim, JongWon
    Uram, Thomas D.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2009, 44 (01) : 17 - 37
  • [38] Semi-trusted Collaborative Framework for Multi-party Computation
    Wong, Kok Seng
    Kim, Myung Ho
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2010, 4 (03): : 411 - 427
  • [39] Data privacy protection in multi-party clustering
    Yang, Weijia
    Huang, Shangteng
    DATA & KNOWLEDGE ENGINEERING, 2008, 67 (01) : 185 - 199
  • [40] Preserving Privacy in Collaborative Systems with Secure Multi-Party Summation
    Chang, Xin
    Kong, Wenhui
    Wang, Xingjun
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 3066 - 3071