Protecting Machine Learning Models from Training Data Set Extraction

Cited by: 0

Authors:

Kalinin, M. O. [1]

Muryleva, A. A. [1]

Platonov, V. V. [1]

Affiliations:

[1] Peter Great St Petersburg Polytech Univ, St Petersburg 195251, Russia

Source:

AUTOMATIC CONTROL AND COMPUTER SCIENCES | 2024 / Vol. 58 / No. 08

Keywords:

noising; machine learning; training set; membership inference; Gaussian noise; privacy

DOI:

10.3103/S0146411624700871

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The problem of protecting machine learning models against data privacy violations that use membership inference on training data sets is considered. A method of protective noising of the training set is proposed. Experiments show that Gaussian noising of the training data with a scale of 0.2 is the simplest and most effective way to protect machine learning models from membership inference on the training set. Compared with the alternatives, this method is easy to implement, applies universally across model types, and reduces the effectiveness of membership inference to 26 percentage points.
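
A minimal sketch of what Gaussian noising of a training set with a scale of 0.2 could look like follows. This record does not describe the authors' implementation, so everything in it is an assumption: the function name `add_gaussian_noise`, reading "scale" as the standard deviation of zero-mean noise, adding independent noise to every feature value, and the toy data.

```python
import random


def add_gaussian_noise(rows, scale=0.2, seed=None):
    """Return a copy of `rows` with zero-mean Gaussian noise added to each value.

    Assumption: `scale` is the standard deviation of the noise, and the
    features are already on a comparable (e.g. normalized) range.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible per seed
    return [[value + rng.gauss(0.0, scale) for value in row] for row in rows]


# Hypothetical toy training set: 3 samples with 2 features each.
X_train = [[0.10, 0.90], [0.45, 0.30], [0.80, 0.55]]

# The model would then be trained on X_noised instead of X_train.
X_noised = add_gaussian_noise(X_train, scale=0.2, seed=0)
```

Only the training inputs are perturbed in this sketch; the labels and the model itself are left unchanged, which is consistent with the abstract's claim that the approach is independent of the model type.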


Pages: 1234-1241

Page count: 8
