An efficient perturbation approach for multivariate data in sensitive and reliable data mining

Cited by: 6
Authors
Paul, Mahit Kumar [1]
Islam, Md Rabiul [1]
Sattar, A. H. M. Sarowar [1]
Affiliations
[1] Rajshahi University of Engineering & Technology, Department of Computer Science & Engineering, Rajshahi 6204, Bangladesh
Keywords
Privacy preserving data mining; Data perturbation; Data privacy; Information privacy; Privacy; Data utility; OF-THE-ART; PRIVACY; INFORMATION
DOI
10.1016/j.jisa.2021.102954
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid advancement of technology, the volume of cloud data is growing quickly, and much of it contains individuals' sensitive information, such as medical diagnostic reports. When knowledge is extracted from such sensitive data, both individuals' privacy and the utility of the data should be preserved, which is a crucial concern in data mining activities. Although several privacy-preserving methods exist, no single method maintains the balance between privacy and data utility. Achieving privacy often costs data utility, and the converse is also true. To address this issue, this work proposes a four-stage data perturbation approach for sensitive data mining, called NRoReM, based on normalization, geometric rotation, linear regression, and scalar multiplication. The proposed approach is evaluated over ten UCI data sets using three benchmark classifiers. The empirical analysis of privacy protection, attack resistance, information entropy, data utility, and error shows that, on 90% of the data sets, NRoReM preserves both individuals' privacy and data utility to a greater extent than 3-Dimensional Rotation Transformation (3DRT) and 2-Dimensional Rotation Transformation (2DRT).
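The abstract names the stages of NRoReM but does not specify them, so the following is only a minimal sketch of three generic building blocks it mentions (min-max normalization, a planar rotation of one attribute pair, and scalar multiplication), not the authors' algorithm. The linear regression stage is left out because the abstract does not say how it is applied. The function names, the sample records, the rotation angle, and the scale factor are all assumptions made for illustration. The sketch shows one reason rotation-based perturbation can retain utility: a rotation keeps pairwise Euclidean distances unchanged, and a scalar multiplication rescales all of them by the same factor.

```python
import math

def min_max_normalize(rows):
    """Scale every column to [0, 1]; a constant column maps to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)]
            for row in rows]

def rotate_pair(rows, i, j, theta):
    """Rotate attributes i and j of every record by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for row in rows:
        new = list(row)
        new[i] = c * row[i] - s * row[j]
        new[j] = s * row[i] + c * row[j]
        out.append(new)
    return out

def scale(rows, k):
    """Multiply every value by the scalar k."""
    return [[k * v for v in row] for row in rows]

# Hypothetical parameters; the abstract does not state how they are chosen.
THETA = math.radians(40)
K = 2.5

def perturb(rows):
    """Normalize, rotate attributes 0 and 1, then scale (illustrative only)."""
    return scale(rotate_pair(min_max_normalize(rows), 0, 1, THETA), K)

# Made-up numeric records (not taken from any UCI data set).
records = [
    [63.0, 145.0, 233.0],
    [37.0, 130.0, 250.0],
    [41.0, 130.0, 204.0],
    [56.0, 120.0, 236.0],
]

normalized = min_max_normalize(records)
perturbed = perturb(records)

# Pairwise distances after perturbation equal K times the normalized ones.
for a in range(len(records)):
    for b in range(a + 1, len(records)):
        before = math.dist(normalized[a], normalized[b])
        after = math.dist(perturbed[a], perturbed[b])
        print(f"pair ({a}, {b}): {before:.4f} -> {after:.4f}")
```

Because every pairwise distance is multiplied by the same factor, the nearest-neighbor ordering among records is unchanged in this sketch, while the individual attribute values no longer match the originals. This property alone does not establish privacy; the attack resistance reported in the abstract concerns the full method, which this sketch does not reproduce.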
Pages: 13