Citizens' data afterlives: Practices of dataset inclusion in machine learning for public welfare

Cited by: 0

Authors:

Ratner, Helene Friis [1,2]

Thylstrup, Nanna Bonde [2]

Affiliations:

[1] Aarhus Univ, Danish Sch Educ DPU, Tuborgvej 164, DK-2400 Copenhagen N, Denmark

[2] Univ Copenhagen, Dept Arts & Cultural Studies, Karen Blixensvej 1, DK-2300 Copenhagen, Denmark

Source:

AI & SOCIETY | 2024 / Vol. 40 / Issue 3

Keywords:

Machine learning; Welfare state; Data afterlives; Dataset negotiations; DATABASES; CHILD; CARE;

DOI:

10.1007/s00146-024-01920-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Public sector adoption of AI techniques in welfare systems recasts historic national data as a resource for machine learning. In this paper, we examine how the use of register data for development of predictive models produces new 'afterlives' for citizen data. First, we document a Danish research project's practical efforts to develop an algorithmic decision-support model for social workers to classify children's risk of maltreatment. Second, we outline the tensions emerging from project members' negotiations about which datasets to include. Third, we identify three types of afterlives for citizen data in machine learning projects: (1) data afterlives for training and testing the algorithm, acting as 'ground truth' for inferring futures, (2) data afterlives for validating the algorithmic model, acting as markers of robustness, and (3) data afterlives for improving the model's fairness, valuated for reasons of data ethics. We conclude by discussing how, on the one hand, these afterlives engender new ethical relations between state and citizens; and how they, on the other hand, also articulate an alternative view on the value of datasets, posing interesting contrasts between machine learning projects developed within the context of the Danish welfare state and mainstream corporate AI discourses of the bigger, the better.


Pages: 1183-1193

Page count: 11
