LIFEDATA - A Framework for Traceable Active Learning Projects

Cited by: 1
Authors
Stieler, Fabian [1, 3]
Elia, Miriam [1]
Weigell, Benjamin [1]
Bauer, Bernhard [1, 3]
Kienle, Peter [2]
Roth, Anton [2]
Muellegger, Gregor [2]
Nann, Marius [2]
Dopfer, Sarah [2]
Affiliations
[1] Univ Augsburg, Inst Comp Sci, Augsburg, Germany
[2] GS Elekt Med Gerate G Stemple GmbH, Kaufering, Germany
[3] Ctr Responsible AI Technol, Munich, Germany
Keywords
Active Learning; Data Labeling; Traceability; Data-Centric AI; Python Framework; Open Source; MODEL;
DOI
10.1109/REW57809.2023.00088
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Active Learning has become a popular method for iteratively improving data-intensive Artificial Intelligence models. However, it often presents a significant challenge when dealing with large volumes of volatile data in projects, as with an Active Learning loop. This paper introduces LIFEDATA, a Python-based framework designed to assist developers in implementing Active Learning projects focusing on traceability. It supports seamless tracking of all artifacts, from data selection and labeling to model interpretation, thus promoting transparency throughout the entire model learning process and enhancing error debugging efficiency while ensuring experiment reproducibility. To showcase its applicability, we present two life science use cases. Moreover, the paper proposes an algorithm that combines query strategies to demonstrate LIFEDATA's ability to reduce data labeling effort.
Pages: 465-474
Number of pages: 10
Related papers
50 in total
  • [1] Secure and Traceable Framework for Data Circulation
    Liang, Kaitai
    Miyaji, Atsuko
    Su, Chunhua
    INFORMATION SECURITY AND PRIVACY, PT I, 2016, 9722 : 376 - 388
  • [2] Graph Deep Active Learning Framework for Data Deduplication
    Cao, Huan
    Du, Shengdong
    Hu, Jie
    Yang, Yan
    Horng, Shi-Jinn
    Li, Tianrui
    BIG DATA MINING AND ANALYTICS, 2024, 7 (03): : 753 - 764
  • [3] Active Learning for Streaming Data in A Contextual Bandit Framework
    Song, Linqi
    Xu, Jie
    Li, Congduan
    ICCDE 2019: PROCEEDINGS OF THE 2019 5TH INTERNATIONAL CONFERENCE ON COMPUTING AND DATA ENGINEERING, 2019, : 29 - 35
  • [4] Online Active Learning Ensemble Framework for Drifted Data Streams
    Shan, Jicheng
    Zhang, Hang
    Liu, Weike
    Liu, Qingbao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (02) : 486 - 498
  • [5] Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision
    Kaushal, Vishal
    Iyer, Rishabh
    Kothawade, Suraj
    Mahadev, Rohan
    Doctor, Khoshrav
    Ramakrishnan, Ganesh
    2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 1289 - 1299
  • [6] Framework for Structuring Big Data Projects
    Grander, Gustavo
    Ferreira Da Silva, Luciano
    Santibanez Gonzalez, Ernesto Del Rosario
    Penha, Renato
    ELECTRONICS, 2022, 11 (21)
  • [7] A Framework for Describing Big Data Projects
    Saltz, Jeffrey
    Shamshurin, Ivan
    Connors, Colin
    BUSINESS INFORMATION SYSTEMS WORKSHOPS, BIS 2016, 2017, 263 : 183 - 195
  • [8] Learning to Sample: an Active Learning Framework
    Shao, Jingyu
    Wang, Qing
    Liu, Fangbing
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019, : 538 - 547
  • [9] Properties of a GP Active Learning Framework for Streaming Data with Class Imbalance
    Khanchi, Sara
    Heywood, Malcolm I.
    Zincir-Heywood, A. Nur
    PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017, : 945 - 952
  • [10] A unified active learning framework for annotating graph data for regression task
    Samoaa, Peter
    Aronsson, Linus
    Longa, Antonio
    Leitner, Philipp
    Chehreghani, Morteza Haghir
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 138