Feature selection for classification under anonymity constraint

Cited by: 3
Authors
Zhang, Baichuan [1]
Mohammed, Noman [2]
Dave, Vachik S. [1]
Al Hasan, Mohammad [1]
Affiliations
[1] Department of Computer and Information Science, Indiana University Purdue University, Indianapolis, IN 46202, United States
[2] Department of Computer Science, Manitoba University, Canada
DOI
Not available
Abstract
Over the last decade, the proliferation of online platforms and their adoption by billions of users have greatly heightened users' privacy risk. Security researchers have shown that sparse microdata describing a user's online activities, although anonymized, can still disclose the user's identity when cross-referenced with other data sources. To preserve user privacy, existing works propose several methods (k-anonymity, ℓ-diversity, differential privacy) for ensuring that a dataset bears a small identity disclosure risk. However, the majority of these methods modify the data in isolation, without considering its utility in subsequent knowledge discovery tasks, which makes the resulting datasets less informative. In this work, we consider labeled data that are generally used for classification, and propose two methods for feature selection with two goals: first, on the reduced feature set the data has a small disclosure risk, and second, the utility of the data is preserved for performing a classification task. Experimental results on various real-world datasets show that the proposed methods are effective and useful in practice. © 2017, University of Skovde. All rights reserved.
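The abstract does not spell out the two proposed methods, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes categorical features, uses k-anonymity as the disclosure constraint, and swaps in a simple stand-in utility score (how many rows the majority label of each equivalence class classifies correctly). The names `is_k_anonymous`, `purity`, and `greedy_select`, and the toy data, are hypothetical.

```python
from collections import Counter, defaultdict


def is_k_anonymous(rows, features, k):
    """True if every value combination on `features` is shared by at least k rows."""
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    return all(c >= k for c in counts.values())


def purity(rows, labels, features):
    """Stand-in utility (an assumption, not the paper's measure): number of rows
    matched by the majority label of their equivalence class."""
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        groups[tuple(row[f] for f in features)][y] += 1
    return sum(max(c.values()) for c in groups.values())


def greedy_select(rows, labels, k):
    """Greedy forward selection: repeatedly add the feature that most improves
    purity while the projected data stays k-anonymous."""
    selected = []
    remaining = set(range(len(rows[0])))
    best = purity(rows, labels, selected)
    while remaining:
        candidates = [
            (purity(rows, labels, selected + [f]), -f, f)  # -f breaks ties by lowest index
            for f in sorted(remaining)
            if is_k_anonymous(rows, selected + [f], k)
        ]
        if not candidates:
            break
        score, _, f = max(candidates)
        if score <= best:  # stop when no admissible feature improves utility
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected


# Toy data: feature 0 predicts the label, feature 1 is noise,
# feature 2 is unique per row (an identifier-like attribute).
rows = [
    (0, 0, 0), (0, 0, 1), (0, 1, 2), (0, 1, 3),
    (1, 0, 4), (1, 0, 5), (1, 1, 6), (1, 1, 7),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
chosen = greedy_select(rows, labels, k=2)
```

On this toy data the identifier-like feature is never admissible for k = 2, because every one of its values occurs in only one row. Adding a feature can only split equivalence classes, never merge them, so a feature set that violates k-anonymity cannot be repaired by adding more features; that is why the sketch filters candidates before scoring them.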
Pages: 1 / 25