Citizens' data afterlives: Practices of dataset inclusion in machine learning for public welfare

Cited by: 0
Authors
Ratner, Helene Friis [1, 2]
Thylstrup, Nanna Bonde [2]
Affiliations
[1] Aarhus Univ, Danish Sch Educ DPU, Tuborgvej 164, DK-2400 Copenhagen N, Denmark
[2] Univ Copenhagen, Dept Arts & Cultural Studies, Karen Blixensvej 1, DK-2300 Copenhagen, Denmark
Keywords
Machine learning; Welfare state; Data afterlives; Dataset negotiations; DATABASES; CHILD; CARE
DOI
10.1007/s00146-024-01920-4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Public sector adoption of AI techniques in welfare systems recasts historic national data as a resource for machine learning. In this paper, we examine how the use of register data for the development of predictive models produces new 'afterlives' for citizen data. First, we document a Danish research project's practical efforts to develop an algorithmic decision-support model for social workers to classify children's risk of maltreatment. Second, we outline the tensions emerging from project members' negotiations about which datasets to include. Third, we identify three types of afterlives for citizen data in machine learning projects: (1) data afterlives for training and testing the algorithm, acting as 'ground truth' for inferring futures, (2) data afterlives for validating the algorithmic model, acting as markers of robustness, and (3) data afterlives for improving the model's fairness, valuated for reasons of data ethics. We conclude by discussing how, on the one hand, these afterlives engender new ethical relations between state and citizens, and how, on the other hand, they also articulate an alternative view on the value of datasets, posing interesting contrasts between machine learning projects developed within the context of the Danish welfare state and mainstream corporate AI discourses of the bigger, the better.
Pages: 1183-1193
Page count: 11