Citizens' data afterlives: Practices of dataset inclusion in machine learning for public welfare

Authors
Ratner, Helene Friis [1, 2]
Thylstrup, Nanna Bonde [2]
Affiliations
[1] Aarhus Univ, Danish Sch Educ DPU, Tuborgvej 164, DK-2400 Copenhagen N, Denmark
[2] Univ Copenhagen, Dept Arts & Cultural Studies, Karen Blixensvej 1, DK-2300 Copenhagen, Denmark
Keywords
Machine learning; Welfare state; Data afterlives; Dataset negotiations; DATABASES; CHILD; CARE;
DOI
10.1007/s00146-024-01920-4
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Public sector adoption of AI techniques in welfare systems recasts historic national data as resource for machine learning. In this paper, we examine how the use of register data for development of predictive models produces new 'afterlives' for citizen data. First, we document a Danish research project's practical efforts to develop an algorithmic decision-support model for social workers to classify children's risk of maltreatment. Second, we outline the tensions emerging from project members' negotiations about which datasets to include. Third, we identify three types of afterlives for citizen data in machine learning projects: (1) data afterlives for training and testing the algorithm, acting as 'ground truth' for inferring futures, (2) data afterlives for validating the algorithmic model, acting as markers of robustness, and (3) data afterlives for improving the model's fairness, valuated for reasons of data ethics. We conclude by discussing how, on one hand, these afterlives engender new ethical relations between state and citizens; and how they, on the other hand, also articulate an alternative view on the value of datasets, posing interesting contrasts between machine learning projects developed within the context of the Danish welfare state and mainstream corporate AI discourses of the bigger, the better.
Pages: 1183-1193
Number of pages: 11