Protecting Machine Learning Models from Training Data Set Extraction

Authors
Kalinin, M. O. [1]
Muryleva, A. A. [1]
Platonov, V. V. [1]
Affiliations
[1] Peter the Great St. Petersburg Polytechnic University, St. Petersburg 195251, Russia
Keywords
noising; machine learning; training set; membership inference; Gaussian noise; privacy
DOI
10.3103/S0146411624700871
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper considers the problem of protecting machine learning models against data privacy violations carried out through membership inference on their training data sets. A method of protective noising of the training set is proposed. Experiments show that adding Gaussian noise with a scale of 0.2 to the training data is the simplest and most effective way to protect machine learning models from membership inference. Compared with the alternatives, this method is easy to implement, applies to any type of model, and reduces the effectiveness of membership inference to 26 percentage points.
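The abstract does not say how the noise is applied, so the following is only a minimal sketch under stated assumptions: the features are numeric and already scaled to a comparable range, the "scale of 0.2" is read as the standard deviation of zero-mean Gaussian noise, and the noise is added once to every feature value before training. The function name `add_gaussian_noise` and the use of NumPy are illustrative choices, not the authors' implementation.

```python
import numpy as np


def add_gaussian_noise(X, scale=0.2, seed=None):
    """Return a copy of X with zero-mean Gaussian noise added to every value.

    Assumption: `scale` is the standard deviation of the noise, and the
    features in X are already normalized to a comparable range.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    noise = rng.normal(loc=0.0, scale=scale, size=X.shape)
    return X + noise


# Hypothetical usage: noise the training features only, then train any
# model on X_train_noisy instead of X_train. Labels are left unchanged.
X_train = np.random.default_rng(0).random((100, 4))
X_train_noisy = add_gaussian_noise(X_train, scale=0.2, seed=42)
```

Because the noising step touches only the data, it can sit in front of any training pipeline, which matches the abstract's claim that the method is universal with respect to model type.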
Pages: 1234-1241
Page count: 8