A privacy protection technique for publishing data mining models and research data

Cited by: 1
Authors
Fu Y. [1]
Chen Z. [1]
Koru G. [1]
Gangopadhyay A. [1]
Affiliation
[1] Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD 21250
Keywords
Preserving data mining;
DOI
10.1145/1877725.1877732
Abstract
Data mining techniques have been widely used in many research disciplines such as medicine, life sciences, and social sciences to extract useful knowledge (such as mining models) from research data. Research data often needs to be published along with the data mining model for verification or reanalysis. However, the privacy of the published data needs to be protected because otherwise the published data is subject to misuse such as linking attacks. Therefore, employing various privacy protection methods becomes necessary. However, these methods only consider privacy protection and do not guarantee that the same mining models can be built from sanitized data. Thus the published models cannot be verified using the sanitized data. This article proposes a technique that not only protects privacy, but also guarantees that the same model, in the form of decision trees or regression trees, can be built from the sanitized data. We have also experimentally shown that other mining techniques can be used to reanalyze the sanitized data. This technique can be used to promote sharing of research data. © 2010 ACM.
Related papers
50 in total
  • [1] Research on Privacy Protection Technology for Data Publishing
    Qu, Lianwei
    Yang, Jing
    Yan, Xueyun
    Ma, Lixin
    Yang, Qixuan
    Han, Yaxin
    2021 21ST INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C 2021), 2021: 999 - 1005
  • [2] Privacy protection data publishing method for data privacy differences
    Yu Y.
    Zhou D.
    Li H.
    Wu X.
    Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2020, 48 (09): 57 - 63
  • [3] Privacy Protection in Data Mining
    Fu, Chunchang
    Zhang, Nan
    2010 INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE AND ENGINEERING (MSE 2010), VOL 2, 2010: 92 - 93
  • [4] Protection or privacy? Data mining and personal data
    Hand, DJ
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 1 - 10
  • [5] DATA MINING AS A TOOL IN PRIVACY-PRESERVING DATA PUBLISHING
    Sramka, Michal
    NILCRYPT 10, 2010, 45 : 151 - 159
  • [6] Privacy as a Service: Publishing Data and Models
    Dandekar, Ashish
    Basu, Debabrota
    Kister, Thomas
    Sen Poh, Geong
    Xu, Jia
    Bressan, Stephane
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 557 - 561
  • [7] Application Research of Data Mining Technology in Personal Privacy Protection and Material Data Analysis
    Liu, Jianguo
    Zhou, Sheng
    INTEGRATED FERROELECTRICS, 2021, 216 (01) : 29 - 42
  • [8] Privacy-Preserving Data Publishing in Process Mining
    Rafiei, Majid
    van der Aalst, Wil M. P.
    BUSINESS PROCESS MANAGEMENT FORUM, BPM FORUM 2020, 2020, 392 : 122 - 138
  • [9] Privacy protection in data mining: A perturbation approach for categorical data
    Li, Xiao-Bai
    Sarkar, Sumit
    INFORMATION SYSTEMS RESEARCH, 2006, 17 (03) : 254 - 270
  • [10] Research on Privacy Preserving Data Mining
    Wang, Pingshui
    Wang, Jiandong
    Zhu, Xinfeng
    Jiang, Jian
    2010 INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT (CCCM2010), VOL I, 2010: 172 - 175