Towards Privacy Preserving Cross Project Defect Prediction with Federated Learning

Cited by: 6
Authors
Yamamoto, Hiroki [1]
Wang, Dong [1]
Rajbahadur, Gopi Krishnan [2]
Kondo, Masanari [1]
Kamei, Yasutaka [1]
Ubayashi, Naoyasu [1]
Affiliations
[1] Kyushu Univ, Fukuoka, Japan
[2] Huawei Technol Canada Co Ltd, Markham, ON, Canada
Keywords
Defect Prediction; Cross Project; Privacy Preservation; Federated Learning
DOI
10.1109/SANER56733.2023.00052
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Defect prediction models predict defects in software projects, and many researchers study them to support debugging efforts in software development. In recent years, there has been growing interest in Cross Project Defect Prediction (CPDP), which predicts defects in a project using a model learned from other projects' data when the target project has insufficient data to build its own model. Because CPDP uses other projects' data, preserving data privacy is one of its most significant issues. However, prior CPDP studies still require data sharing among projects to train models and do not fully consider protecting project confidentiality. To address this, we propose FLR, a CPDP model employing federated learning, a distributed machine learning approach that does not require data sharing. We evaluate FLR on 25 projects to investigate its effectiveness and feature interpretation. Our key results show that, first, FLR outperforms the existing privacy-preserving methods (i.e., LACE2), while its performance is relatively comparable to that of conventional methods (e.g., supervised and unsupervised learning). Second, the interpretation analysis shows that scale-related features have a common effect on the prediction performance of FLR. In addition, further insights demonstrate that the parameters of federated learning (e.g., learning rates and the number of clients) also play a role in performance. This study serves as a first step to confirm the feasibility of employing federated learning in CPDP to ensure privacy preservation and lays the groundwork for future research on applying other machine learning models to federated learning.
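The abstract does not spell out how FLR is built, so the following is only a generic sketch of the idea it describes: each project (client) trains on its own data, and only model weights are sent to a server for averaging. It assumes a logistic regression model and a FedAvg-style weighted average, which this record does not confirm. All function names, the synthetic data, and the hyperparameter values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of logistic regression on one
    client's private data, starting from the current global weights."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def federated_average(client_data, n_features, rounds=50, lr=0.1):
    """Assumed FedAvg-style loop: clients share weights, never raw data."""
    w_global = np.zeros(n_features)
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in client_data:
            # Each client trains locally; only the weight vector leaves it.
            updates.append(local_update(w_global, X, y, lr=lr))
            sizes.append(len(y))
        # The server averages client weights, weighted by local sample count.
        w_global = np.average(updates, axis=0, weights=sizes)
    return w_global

def make_synthetic_clients(n_clients=3, n_samples=200, seed=0):
    """Hypothetical stand-in for per-project defect data (not real metrics)."""
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -1.0, 0.5])
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n_samples, 3))
        y = (sigmoid(X @ true_w) > rng.uniform(size=n_samples)).astype(float)
        clients.append((X, y))
    return clients

clients = make_synthetic_clients()
w = federated_average(clients, n_features=3)
```

In this sketch the learning rate (`lr`) and the number of clients (`n_clients`) are the kinds of federated learning parameters the abstract says affect performance; varying them here only illustrates where such knobs would sit, not the study's results.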
Pages: 485 - 496
Page count: 12
Related papers
50 in total
  • [1] Better Knowledge Enhancement for Privacy-Preserving Cross-Project Defect Prediction
    Wang, Yuying
    Li, Yichen
    Wang, Haozhao
    Zhao, Lei
    Zhang, Xiaofang
    JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2025, 37 (01)
  • [2] Towards Efficient and Privacy-preserving Federated Deep Learning
    Hao, Meng
    Li, Hongwei
    Xu, Guowen
    Liu, Sen
    Yang, Haomiao
    ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019
  • [3] PVFL: Verifiable federated learning and prediction with privacy-preserving
    Yin, Benxin
    Zhang, Hanlin
    Lin, Jie
    Kong, Fanyu
    Yu, Leyun
    COMPUTERS & SECURITY, 2024, 139
  • [4] Towards Efficient and Privacy-Preserving Federated Learning for HMM Training
    Zheng, Yandong
    Zhu, Hui
    Lu, Rongxing
    Zhang, Songnian
    Guan, Yunguo
    Wang, Fengwei
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 38 - 43
  • [5] Towards robust and privacy-preserving federated learning in edge computing
    Zhou, Hongliang
    Zheng, Yifeng
    Jia, Xiaohua
    COMPUTER NETWORKS, 2024, 243
  • [6] Privacy-Preserving Traffic Flow Prediction: A Federated Learning Approach
    Liu, Yi
    Yu, James J. Q.
    Kang, Jiawen
    Niyato, Dusit
    Zhang, Shuyu
    IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (08) : 7751 - 7763
  • [7] Preserving Privacy and Security in Federated Learning
    Nguyen, Truc
    Thai, My T.
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (01) : 833 - 843
  • [8] Gestational weight gain prediction using privacy preserving federated learning
    Puri, Chetanya
    Dolui, Koustabh
    Kooijman, Gerben
    Masculo, Felipe
    Van Sambeek, Shannon
    Den Boer, Sebastiaan
    Michiels, Sam
    Hallez, Hans
    Luca, Stijn
    Vanrumste, Bart
    2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021, : 2170 - 2174
  • [9] POI Recommendation with Federated Learning and Privacy Preserving in Cross Domain Recommendation
    Wang, Li-E
    Wang, Yihui
    Bai, Yan
    Liu, Peng
    Li, Xianxian
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS (IEEE INFOCOM WKSHPS 2021), 2021
  • [10] Privacy-Preserving Power Consumption Prediction Based on Federated Learning with Cross-Entity Data
    Liu, Haizhou
    Zhang, Xuan
    Shen, Xinwei
    Sun, Hongbin
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 181 - 186