Tailoring Data Source Distributions for Fairness-aware Data Integration

Cited by: 16
Authors
Nargesian, Fatemeh [1]
Asudeh, Abolfazl [2]
Jagadish, H. V. [3]
Affiliations
[1] Univ Rochester, Rochester, NY USA
[2] Univ Illinois, Chicago, IL USA
[3] Univ Michigan, Ann Arbor, MI 48109 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2021 / Vol. 14 / No. 11
Funding
U.S. National Science Foundation
DOI
10.14778/3476249.3476299
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data scientists often develop data sets for analysis by drawing upon sources of data available to them. A major challenge is to ensure that the data set used for analysis has an appropriate representation of relevant (demographic) groups, that is, that it meets the desired distribution requirements. Whether data is collected through some experiment or obtained from some data provider, the data from any single source may not meet the desired distribution requirements. Therefore, a union of data from multiple sources is often required. In this paper, we study how to acquire such data in the most cost-effective manner, for typical cost functions observed in practice. We present an optimal solution for binary groups when the underlying distributions of data sources are known and all data sources have equal costs. For the generic case with unequal costs, we design an approximation algorithm that performs well in practice. When the underlying distributions are unknown, we develop an exploration-exploitation-based strategy with a reward function that captures the cost and approximations of group distributions in each data source. Besides theoretical analysis, we conduct comprehensive experiments that confirm the effectiveness of our algorithms.
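
The problem setting in the abstract (known group distributions per source, a per-sample cost, and a required count per group) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `collect`, the greedy scoring rule (probability of drawing a still-needed group per unit cost), and the example sources are all hypothetical and chosen only for illustration.

```python
import random


def collect(sources, costs, need, rng):
    """Draw one sample at a time until every group's quota in `need` is met.

    sources: list of dicts mapping group -> probability (assumed known)
    costs:   per-sample cost of each source
    need:    dict mapping group -> required number of samples
    Returns (total_cost, collected_counts).
    """
    remaining = dict(need)
    collected = {g: 0 for g in need}
    total_cost = 0.0

    def score(i):
        # Chance that one draw from source i yields a still-needed group,
        # divided by the cost of that draw (illustrative heuristic only).
        useful = sum(p for g, p in sources[i].items() if remaining.get(g, 0) > 0)
        return useful / costs[i]

    while any(r > 0 for r in remaining.values()):
        best = max(range(len(sources)), key=score)
        if score(best) == 0:
            raise ValueError("no source can supply the groups still needed")
        groups = list(sources[best])
        weights = [sources[best][g] for g in groups]
        drawn = rng.choices(groups, weights=weights)[0]
        total_cost += costs[best]  # every draw is paid for, useful or not
        if remaining.get(drawn, 0) > 0:
            remaining[drawn] -= 1
            collected[drawn] += 1
    return total_cost, collected


# Hypothetical example: two sources with opposite skews and equal costs.
example_sources = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]
cost, counts = collect(example_sources, [1.0, 1.0], {"a": 5, "b": 5}, random.Random(0))
print(cost, counts)
```

Draws that return an already-satisfied group are discarded but still charged, which is why the total cost can exceed the number of samples kept.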
Pages: 2519-2532
Page count: 14