MSR4ML: Reconstructing Artifact Traceability in Machine Learning Repositories

Cited by: 6

Authors:

Njomou, Aquilas Tchanjou [1]

Africa, Alexandra Johanne Bifona [1]

Adams, Bram [1]

Fokaefs, Marios [1]

Affiliations:

[1] Polytech Montreal, Dept Comp & Software Engn, Montreal, PQ, Canada

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021) | 2021

Keywords:

Model Traceability; Machine Learning Operations; Mining Software Repositories; Model Mining; Metadata Extraction; Developer Productivity;

DOI:

10.1109/SANER50967.2021.00061

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The increasing popularity of Machine Learning (ML) also creates challenges for developers. The multitude of programming languages, libraries and available resources allows them to easily build their own models or algorithms. However, ML models are tightly connected to their data, which implies a development process different from that of other types of software. Software projects often rely on version control platforms, such as GitHub, but these platforms have not yet been extended to support ML projects: there is poor support for data versioning and no link between ML and software artifacts. Thus, traceability and model evolution can become challenging for developers. While some ML-specific platforms exist, they still require considerable manual specification of ML artifacts and the links between them. In this work, we propose a framework for automatic identification and traceability of links between data, code and ML models through Mining Software Repositories (MSR) techniques. Our tool combines static code analysis with mining of commit data to identify ML, code and data artifacts, reconstruct the links between them, and retrieve the commits that affect each end of a link. The objective is to increase developers' productivity and their awareness of their project through the recovered traceability.
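
As a rough illustration of the static-analysis half of this idea, the sketch below scans Python source for string literals passed to function calls and classifies them as data or model files by extension, then pairs them as candidate data-to-model links. It is not the authors' tool: the function names (`find_artifacts`, `candidate_links`), the extension lists, and the "same script implies a link" heuristic are all assumptions made for this example.

```python
import ast
from itertools import product

# Assumed extension lists; a real tool would need a richer classification.
DATA_EXTS = (".csv", ".json", ".parquet")
MODEL_EXTS = (".pkl", ".joblib", ".h5", ".pt")


def find_artifacts(source: str) -> dict:
    """Collect string literals passed to calls that look like data or model paths."""
    found = {"data": set(), "model": set()}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Inspect both positional and keyword arguments of every call.
        args = list(node.args) + [kw.value for kw in node.keywords]
        for arg in args:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                path = arg.value.lower()
                if path.endswith(DATA_EXTS):
                    found["data"].add(arg.value)
                elif path.endswith(MODEL_EXTS):
                    found["model"].add(arg.value)
    return {kind: sorted(paths) for kind, paths in found.items()}


def candidate_links(source: str) -> list:
    """Pair every data file with every model file referenced in the same script."""
    artifacts = find_artifacts(source)
    return list(product(artifacts["data"], artifacts["model"]))


# Hypothetical training script; it is only parsed, never executed.
EXAMPLE = '''
import pandas as pd, joblib
from sklearn.linear_model import LogisticRegression
df = pd.read_csv("data/train.csv")
model = LogisticRegression().fit(df[["x"]], df["y"])
joblib.dump(model, "models/clf.joblib")
'''

print(candidate_links(EXAMPLE))
```

For the commit-mining half, one plausible follow-up (again an assumption, not something the abstract specifies) would be to run `git log --follow -- <path>` for each path in a recovered link to list the commits that touched either end.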

Pages: 536-540

Page count: 5

30 records in total

[1] ML4ML: Automated Invariance Testing for Machine Learning Models
Liao, Zukang
Zhang, Pengfei
Chen, Min
2022 FOURTH IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING (AITEST 2022), 2022, : 34 - 41
[2] Machine Learning for Health (ML4H) 2021
Roy, Subhrajit
Pfohl, Stephen
Tadesse, Girmaw Abebe
Oala, Luis
Falck, Fabian
Zhou, Yuyin
Shen, Liyue
Zamzmi, Ghada
Mugambi, Purity
Zirikly, Ayah
McDermott, Matthew B.A.
Alsentzer, Emily
Proceedings of Machine Learning Research, 2021, 158 : 1 - 12
[3] Machine Learning for Health (ML4H) 2022
Parziale, Antonio
Agrawal, Monica
Tang, Shengpu
Severson, Kristen
Oala, Luis
Subbaswamy, Adarsh
Kumar, Sayantan
Schoerverth, Elora
Hegselmann, Stefan
Zhou, Helen
Zamzmi, Ghada
Mugambi, Purity
Sizikova, Elena
Tadesse, Girmaw Abebe
Zhou, Yuyin
Killian, Taylor
Zhang, Haoran
Kamran, Fahad
Hobby, Andrea
Huang, Mars
Alaa, Ahmed
Singh, Harvineet
Chen, Irene Y.
Joshi, Shalmali
MACHINE LEARNING FOR HEALTH, VOL 193, 2022, 193 : 1 - 11
[4] Machine Learning for Health (ML4H) 2021
Roy, Subhrajit
Pfohl, Stephen
Tadesse, Girmaw Abebe
Oala, Luis
Falck, Fabian
Zhou, Yuyin
Shen, Liyue
Zamzmi, Ghada
Mugambi, Purity
Zirikly, Ayah
McDermott, Matthew B. A.
Alsentzer, Emily
MACHINE LEARNING FOR HEALTH, VOL 158, 2021, 158 : 1 - 12
[5] Machine Learning for Health (ML4H) 2023
Hegselmann, Stefan
Parziale, Antonio
Shanmugam, Divya
Tang, Shengpu
Severson, Kristen
Asiedu, Mercy Nyamewaa
Chang, Serina
Dossou, Bonaventure F. P.
Huang, Qian
Kamran, Fahad
Zhang, Haoran
Nagaraj, Sujay
Oala, Luis
Xu, Shan
Okolo, Chinasa T.
Zhou, Helen
Dafflon, Jessica
Ellington, Caleb
Jabbour, Sarah
Jeong, Hyewon
Nieva, Harry Reyes
Yang, Yuzhe
Zamzmi, Ghada
Mhasawade, Vishwali
Truong, Van
Chandak, Payal
Lee, Matthew
Argaw, Peniel
Heuton, Kyle
Singh, Harvineet
Hartvigsen, Thomas
MACHINE LEARNING FOR HEALTH, ML4H, VOL 225, 2023, 225 : 1 - 12
[6] Machine Learning for Health (ML4H) 2019: What Makes Machine Learning in Medicine Different?
Dalca, Adrian V.
Mcdermott, Matthew
Alsentzer, Emily
Finlayson, Sam
Oberst, Michael
Falck, Fabian
Chivers, Corey
Beam, Andrew L.
Naumann, Tristan
Beaulieu-Jones, Brett
MACHINE LEARNING FOR HEALTH WORKSHOP, VOL 116, 2019, 116 : 1 - 9
[7] QoA4ML-A Framework for Supporting Contracts in Machine Learning Services
Truong, Hong-Linh
Nguyen, Tri-Minh
2021 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, ICWS 2021, 2021, : 465 - 475
[8] Demonstration Paper: Monitoring Machine Learning Contracts with QoA4ML
Minh-Tri Nguyen
Hong-Linh Truong
COMPANION OF THE ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE 2021, 2021, : 169 - 170
[9] VIS4ML: An Ontology for Visual Analytics Assisted Machine Learning
Sacha, Dominik
Kraus, Matthias
Keim, Daniel A.
Chen, Min
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2019, 25 (01) : 385 - 395
[10] MLP4ML: Machine Learning Service Recommendation System using MLP
Alghofaily, Bayan
Ding, Chen
2020 IEEE 13TH INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2020), 2020, : 84 - 91