TREATS: Fairness-aware entity resolution over streaming data

Citations: 0
Authors
Araujo, Tiago Brasileiro [1, 2]
Efthymiou, Vasilis [3, 4]
Christophides, Vassilis [5]
Pitoura, Evaggelia [6]
Stefanidis, Kostas [1]
Affiliations
[1] Tampere Univ, Tampere, Finland
[2] Fed Inst Paraiba, Soledade, Brazil
[3] Harokopio Univ Athens, Athens, Greece
[4] FORTH ICS, Iraklion, Greece
[5] ENSEA, ETIS, Paris, France
[6] Univ Ioannina, Ioannina, Greece
Keywords
Entity resolution; Streaming data; Fairness; Incremental processing; Distributed processing; Machine learning
DOI
10.1016/j.is.2024.102506
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growing proliferation of information systems continuously generates large volumes of data from a variety of sources, such as web platforms, social networks, and multiple devices. These data often lack a defined schema and must be consolidated and cleansed before analysis and knowledge extraction can occur. In this context, Entity Resolution (ER) plays a crucial role, facilitating the integration of knowledge bases and identifying similarities among entities from different sources. However, the traditional ER process is computationally expensive and becomes more complicated in the streaming context, where data arrive continuously. Moreover, few studies address fairness, understood as the absence of discrimination or bias, in ER. Fairness criteria aim to mitigate the implications of data bias in ER systems, which requires more than just optimizing accuracy, as is traditionally done. This work presents TREATS, a schema-agnostic and fairness-aware ER workflow developed for managing streaming data incrementally. The proposed fairness-aware ER framework tackles constraints across various groups of interest, presenting a resilient and equitable solution to the related challenges. In an experimental evaluation, the proposed techniques and heuristics are compared against state-of-the-art approaches over five real-world data source pairs; the results demonstrate significant improvements in fairness without degrading effectiveness or efficiency in the streaming environment. In summary, our contributions aim to propel the ER field forward by providing a workflow that addresses both technical challenges and ethical concerns.
Pages: 16