Quality Assessment and Biases in Reused Data

Cited by: 3
|
Authors
Fernandez-Ardevol, Mireia [1,2]
Rosales, Andrea [1,2]
Affiliations
[1] Univ Oberta Catalunya UOC, Fac Informat & Commun Sci, Barcelona, Catalonia, Spain
[2] Univ Oberta Catalunya UOC, IN3 Internet Interdisciplinary Inst, Barcelona, Catalonia, Spain
Keywords
data quality; data biases; reused data; reused traces; open data; online behavioral advertising
DOI
10.1177/00027642221144855
Chinese Library Classification number
B849 [Applied Psychology]
Discipline classification code
040203
Abstract
This article investigates digital and non-digital traces reused beyond the context of creation. A central idea of this article is that no (reused) dataset is perfect. Therefore, data quality assessment becomes essential to determine if a given dataset is "good enough" to be used to fulfill the users' goals. Biases, a possible source of discrimination, have become a relevant data challenge. Consequently, it is appropriate to analyze whether quality assessment indicators provide information on potential biases in the dataset. We use examples representing two opposing sides regarding data access to reflect on the relationship between quality and bias. First, the European Union open data portal fosters the democratization of data and expects users to manipulate the databases directly to perform their analyses. Second, online behavioral advertising systems offer individualized promotional services but do not share the datasets supporting their design. Quality assessment is socially constructed, as there is not a universal definition but a set of quality dimensions, which might change for each professional context. From the users' perspective, trust/credibility stands out as a relevant quality dimension in the two analyzed cases. Results show that quality indicators (whatever they are) provide limited information on potential biases. We suggest that data literacy is most needed among both open data users and clients of behavioral advertising systems. Notably, users must (be able to) understand the limitations of datasets for an optimal and bias-free interpretation of results and decision-making.
Pages: 696-710
Number of pages: 15
Related papers
50 in total
  • [31] Statistical data analysis in quality assessment
    Ohsumi, N
    JOURNAL OF THE FOOD HYGIENIC SOCIETY OF JAPAN, 1998, 39 (05): : J384 - J389
  • [32] Provenance management for data quality assessment
    Zheng, Hua
    Zhu, Qinghua
    Wu, Kewen
    Journal of Software, 2012, 7 (08) : 1905 - 1910
  • [33] Linked Data Quality Assessment: A Survey
    Nayak, Aparna
    Bozic, Bojan
    Longo, Luca
    WEB SERVICES - ICWS 2021, 2022, 12994 : 63 - 76
  • [34] Crowdsourcing Linked Data Quality Assessment
    Acosta, Maribel
    Zaveri, Amrapali
    Simperl, Elena
    Kontokostas, Dimitris
    Auer, Soeren
    Lehmann, Jens
    SEMANTIC WEB - ISWC 2013, PART II, 2013, 8219 : 260 - 276
  • [35] An assessment of data quality for structural identification
    Wadia-Fascetti, S
    Rivero, F
    Sanayei, M
    APPLICATIONS OF STATISTICS AND PROBABILITY, VOLS 1 AND 2: CIVIL ENGINEERING RELIABILITY AND RISK ANALYSIS, 2000, : 813 - 820
  • [36] Quality Assessment of Trade Data in Malaysia
    Ali, Dhakir Abbas
    Johari, Fuadah
    Alias, Mohammad Haji
    MALAYSIAN JOURNAL OF ECONOMIC STUDIES, 2019, 56 (01) : 23 - 42
  • [37] Methodologies for Data Quality Assessment and Improvement
    Batini, Carlo
    Cappiello, Cinzia
    Francalanci, Chiara
    Maurino, Andrea
    ACM COMPUTING SURVEYS, 2009, 41 (03)
  • [38] Statistical data analysis in quality assessment
    Ohsumi, N
    JOURNAL OF THE FOOD HYGIENIC SOCIETY OF JAPAN, 2000, 41 (03) : J238 - J242
  • [39] DMN for Data Quality Measurement and Assessment
    Valencia-Parra, Alvaro
    Parody, Luisa
    Jesus Varela-Vaca, Angel
    Caballero, Ismael
    Teresa Gomez-Lopez, Maria
    BUSINESS PROCESS MANAGEMENT WORKSHOPS (BPM 2019), 2019, 362 : 362 - 374
  • [40] Quality assessment of affymetrix GeneChip data
    Heber, Steffen
    Sick, Beate
    OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY, 2006, 10 (03) : 358 - 368