Citizens' data afterlives: Practices of dataset inclusion in machine learning for public welfare

Cited by: 0
Authors
Ratner, Helene Friis [1, 2]
Thylstrup, Nanna Bonde [2 ]
Affiliations
[1] Aarhus Univ, Danish Sch Educ DPU, Tuborgvej 164, DK-2400 Copenhagen N, Denmark
[2] Univ Copenhagen, Dept Arts & Cultural Studies, Karen Blixensvej 1, DK-2300 Copenhagen, Denmark
Keywords
Machine learning; Welfare state; Data afterlives; Dataset negotiations; DATABASES; CHILD; CARE
DOI
10.1007/s00146-024-01920-4
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Public sector adoption of AI techniques in welfare systems recasts historic national data as a resource for machine learning. In this paper, we examine how the use of register data for the development of predictive models produces new 'afterlives' for citizen data. First, we document a Danish research project's practical efforts to develop an algorithmic decision-support model for social workers to classify children's risk of maltreatment. Second, we outline the tensions emerging from project members' negotiations about which datasets to include. Third, we identify three types of afterlives for citizen data in machine learning projects: (1) data afterlives for training and testing the algorithm, acting as 'ground truth' for inferring futures, (2) data afterlives for validating the algorithmic model, acting as markers of robustness, and (3) data afterlives for improving the model's fairness, valuated for reasons of data ethics. We conclude by discussing how, on the one hand, these afterlives engender new ethical relations between state and citizens; and how they, on the other hand, also articulate an alternative view on the value of datasets, posing interesting contrasts between machine learning projects developed within the context of the Danish welfare state and mainstream corporate AI discourses of the bigger, the better.
Pages: 1183-1193
Page count: 11
Related papers
50 in total
  • [1] Public debt and welfare with machine learning
    Zhu, Jingjing
    Huang, Tianyuan
    FINANCE RESEARCH LETTERS, 2024, 69
  • [2] Machine Learning Techniques for Intrusion Detection on Public Dataset
    Thanthrige, Udaya Sampath K. Perera Miriya
    Samarabandu, Jagath
    Wang, Xianbin
    2016 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2016
  • [3] A machine learning dataset for FRB detection in raw data
    Xu, ZhiJun
    An, Tao
    Guo, ShaoGuang
    Lao, BaoQiang
    Lv, WeiJia
    Wu, XiaoCong
    SCIENTIA SINICA-PHYSICA MECHANICA & ASTRONOMICA, 2023, 53 (02)
  • [4] Runtime Data Layout Scheduling for Machine Learning Dataset
    You, Yang
    Demmel, James
    2017 46TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2017, : 452 - 461
  • [5] Exploratory Data Analysis and Machine Learning on Titanic Disaster Dataset
    Singh, Karman
    Nagpal, Renuka
    Sehgal, Rajni
    PROCEEDINGS OF THE CONFLUENCE 2020: 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING, 2020, : 320 - 326
  • [6] Reintroducing KAPD as a Dataset for Machine Learning and Data Mining Applications
    Seddiq, Yasser
    Meftah, Ali
    Alghamdi, Mansour
    Alotaibi, Yousef
    UKSIM-AMSS 10TH EUROPEAN MODELLING SYMPOSIUM ON COMPUTER MODELLING AND SIMULATION (EMS), 2016, : 70 - 74
  • [7] Curation of a Large Public Head and Neck Dataset for Machine Learning (RADCURE) Using An Automated Data Mining Platform
    Welch, M.
    Patel, T.
    Kazmierski, M.
    Marsilla, J.
    Huang, S.
    Kim, S.
    Rey-Mcintyre, K.
    O'Sullivan, B.
    Waldron, J.
    Becker, N.
    Bratman, S.
    Hope, A.
    Haibe-Kains, B.
    Tadic, T.
    MEDICAL PHYSICS, 2022, 49 (06) : E231 - E231
  • [8] The social construction of datasets: On the practices, processes, and challenges of dataset creation for machine learning
    Orr, Will
    Crawford, Kate
    NEW MEDIA & SOCIETY, 2024, 26 (09) : 4955 - 4972
  • [9] Machine Learning for Neurodegenerative Disorder - Diagnosis Survey of Practices and Launch of Benchmark Dataset
    Tagaris, Athanasios
    Kollias, Dimitrios
    Stafylopatis, Andreas
    Tagaris, Georgios
    Kollias, Stefanos
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2018, 27 (03)
  • [10] Forged handwriting verification: a public domain dataset for training machine learning models
    Monaro, Merylin
    Fietta, Valentina
    Curro, Valentina
    Lusetti, Giulia
    Sartori, Giuseppe
    Navarin, Nicolo
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022