Feature selection for classification under anonymity constraint

Cited by: 3
Authors
Zhang, Baichuan [1]
Mohammed, Noman [2]
Dave, Vachik S. [1]
Al Hasan, Mohammad [1]
Affiliations
[1] Department of Computer and Information Science, Indiana University Purdue University, Indianapolis, IN 46202, United States
[2] Department of Computer Science, Manitoba University, Canada
DOI
Not available
Abstract
Over the last decade, the proliferation of online platforms and their adoption by billions of users have greatly heightened users' privacy risk. Security researchers have shown that sparse microdata describing a user's online activities, although anonymous, can still be used to disclose the user's identity by cross-referencing it with other data sources. To preserve user privacy, existing work proposes several methods (k-anonymity, l-diversity, differential privacy) for ensuring that a dataset bears a small identity disclosure risk. However, the majority of these methods modify the data in isolation, without considering its utility in subsequent knowledge discovery tasks, which makes the resulting datasets less informative. In this work, we consider labeled data of the kind generally used for classification and propose two feature selection methods with two goals: first, that the data has a small disclosure risk on the reduced feature set, and second, that the utility of the data is preserved for performing a classification task. Experimental results on various real-world datasets show that the methods are effective and useful in practice. © 2017, University of Skovde. All rights reserved.
Pages: 1-25