A fair-multicluster approach to clustering of categorical data

Cited by: 0
Authors
Carlos Santos-Mangudo
Antonio J. Heras
Affiliation
[1] Complutense University of Madrid, Financial and Actuarial Economics and Statistics Department
Keywords
Clustering; Fairness; Fair clustering; Categorical data
DOI
Not available
Abstract
In recent years, the need to prevent classification biases due to race, gender, social status, etc. has increased interest in designing fair clustering algorithms. The main idea is to ensure that the output of a clustering algorithm is not biased towards or against specific subgroups of the population. There is a growing specialized literature on this topic, dealing with the problem of clustering numerical databases. Nevertheless, to our knowledge, there are no previous papers devoted to the problem of fair clustering of purely categorical attributes. In this paper, we show that the Multicluster methodology proposed by Santos and Heras (Interdiscip J Inf Knowl Manag 15:227–246, 2020. https://doi.org/10.28945/4643) for clustering categorical data can be modified in order to increase the fairness of the clusters. Of course, there is a trade-off between fairness and efficiency, so an increase in fairness usually leads to a loss of classification efficiency. Yet it is possible to reach a reasonable compromise between these goals, since the methodology proposed by Santos and Heras (2020) can easily be adapted to obtain homogeneous and fair clusters.
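As an illustration of what "not biased towards or against specific subgroups" can mean in practice, the following minimal sketch compares each protected group's share inside every cluster with its share in the whole population. It is not the fairness measure used in the paper, which this record does not specify; the function name `max_share_gap` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def max_share_gap(labels, protected):
    """Largest absolute difference between a protected group's share
    inside any cluster and its share in the whole population.

    0.0 means every cluster mirrors the population exactly; larger
    values mean at least one cluster over- or under-represents a group.
    This is an illustrative measure, not the one used in the paper.
    """
    n = len(protected)
    overall = Counter(protected)          # group -> count in population
    by_cluster = defaultdict(Counter)     # cluster -> (group -> count)
    for cluster, group in zip(labels, protected):
        by_cluster[cluster][group] += 1

    gap = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        for group, total in overall.items():
            gap = max(gap, abs(counts[group] / size - total / n))
    return gap


# Toy data (hypothetical): four records, two clusters, two groups.
balanced = max_share_gap([0, 0, 1, 1], ["a", "b", "a", "b"])
segregated = max_share_gap([0, 0, 1, 1], ["a", "a", "b", "b"])
print(balanced, segregated)
```

A clustering in which each cluster reproduces the population mix scores 0.0, while one that separates the groups completely scores higher. The trade-off mentioned in the abstract arises because lowering this gap can force dissimilar records into the same cluster.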
Pages: 583-604
Page count: 21
Related papers
50 in total
  • [21] Mixture of Networks for Clustering Categorical Data: A Penalized Composite Likelihood Approach
    Baek, Jangsun
    Park, Jeong-Soo
    AMERICAN STATISTICIAN, 2023, 77 (03): : 259 - 273
  • [22] A data labeling method for clustering categorical data
    Cao, Fuyuan
    Liang, Jiye
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (03) : 2381 - 2385
  • [23] Data Reduction Method for Categorical Data Clustering
    Rendon, Erendira
    Salvador Sanchez, J.
    Garcia, Rene A.
    Abundez, Itzel
    Gutierrez, Citlalih
    Gasca, Eduardo
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2008, PROCEEDINGS, 2008, 5290 : 143 - +
  • [24] Incremental Clustering for Categorical Data Using Clustering Ensemble
    Li Taoying
    Chne Yan
    Qu Lili
    Mu Xiangwei
    PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE, 2010, : 2519 - 2524
  • [25] Ordering of categorical data in hierarchical clustering
    Kazimianec, Michail
    DATABASES AND INFORMATION SYSTEMS, 2008, : 401 - 404
  • [26] A Clustering Method for Categorical Ordinal Data
    Giordan, Marco
    Diana, Giancarlo
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (07) : 1315 - 1334
  • [27] Formulations of fuzzy clustering for categorical data
    Umayahara, Kazutaka
    Miyamoto, Sadaaki
    Nakamori, Yoshiteru
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2005, 1 (01): : 83 - 94
  • [28] HABOS clustering algorithm for categorical data
    Wu, Sen (wusen@manage.ustb.edu.cn), 2016, Science Press (38):
  • [29] Space Structure and Clustering of Categorical Data
    Qian, Yuhua
    Li, Feijiang
    Liang, Jiye
    Liu, Bing
    Dang, Chuangyin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (10) : 2047 - 2059
  • [30] Conceptual clustering categorical data with uncertainty
    Xia, Yuni
    Xi, Bowei
    19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL I, PROCEEDINGS, 2007, : 329 - +