DiffML: End-to-end Differentiable ML Pipelines

Cited by: 3
Authors
Hilprecht, Benjamin [1]
Hammacher, Christian [2]
Reis, Eduardo [1]
Abdelaal, Mohamed
Binnig, Carsten [1]
Affiliations
[1] Tech Univ Darmstadt, Darmstadt, Germany
[2] Software AG, Mainz, Germany
Source
PROCEEDINGS OF THE SEVENTH WORKSHOP ON DATA MANAGEMENT FOR END-TO-END MACHINE LEARNING, DEEM | 2023
Keywords
data engineering; differentiable ML pipelines; data cleaning;
DOI
10.1145/3595360.3595857
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present our vision of differentiable ML pipelines, called DiffML, which allows the construction of ML pipelines to be automated in an end-to-end fashion. DiffML makes it possible to jointly train not just the ML model itself but the entire pipeline, including data engineering steps such as data cleaning and data augmentation. Our core idea is to formulate all steps in a differentiable way such that the entire pipeline can be trained using backpropagation. However, this is a non-trivial problem and opens up many new research questions. To show the feasibility of this direction, we demonstrate initial ideas and a general principle of how typical data engineering steps can be formulated as differentiable programs and jointly learned with the ML model. Moreover, we discuss a research roadmap and core challenges that have to be systematically tackled to enable fully differentiable ML pipelines.
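The core idea in the abstract, a data engineering step written so that gradients flow through it, can be illustrated with a toy sketch. The example below is not taken from the paper: the task (a learnable imputation value `m` for missing inputs), the function `train`, the data, and the hyperparameters are all hypothetical, and the gradients are derived by hand rather than by an autodiff framework. It only shows how a cleaning parameter and model parameters might be updated from the same loss.

```python
# Hypothetical toy example: names, data, and hyperparameters are illustrative
# assumptions, not the method described in the DiffML paper.

def train(rows, lr=0.02, steps=20000):
    """Jointly fit a linear model y = w*x + b and a learnable imputation value m.

    rows is a list of (x, y) pairs; x is None when the feature is missing.
    """
    w, b, m = 0.0, 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = grad_b = grad_m = 0.0
        for x, y in rows:
            missing = x is None
            # Differentiable "cleaning" step: fill missing inputs with m.
            x_filled = m if missing else x
            err = (w * x_filled + b) - y
            # Gradients of the mean squared error with respect to w and b.
            grad_w += 2.0 * err * x_filled / n
            grad_b += 2.0 * err / n
            if missing:
                # Chain rule through the imputation: d(pred)/dm = w.
                grad_m += 2.0 * err * w / n
        # One gradient step updates the model and the cleaning step together.
        w -= lr * grad_w
        b -= lr * grad_b
        m -= lr * grad_m
    return w, b, m


# Observed rows follow y = 2*x + 1; the two rows with a missing x have y = 7.
rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (4.0, 9.0), (None, 7.0), (None, 7.0)]
w, b, m = train(rows)
print(f"w={w:.3f} b={b:.3f} m={m:.3f}")
```

In this sketch the imputation value is treated like any other trainable parameter, so the downstream loss decides how missing values are filled. A real pipeline would more likely rely on an automatic differentiation library than on hand-written gradients.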
Pages: 7
Related papers
50 in total
  • [21] Looper: An End-to-end ML Platform for Product Decisions
    Markov, Igor L.
    Wang, Hanson
    Kasturi, Nitya S.
    Singh, Shaun
    Garrard, Mia R.
    Huang, Yin
    Yuen, Sze Wai
    Tran, Sarah
    Wang, Zehui
    Glotov, Igor
    Gupta, Tanvi
    Chen, Peng
    Huang, Boshuang
    Xie, Xiaowen
    Belkin, Michael
    Uryasev, Sal
    Howie, Sam
    Bakshy, Eytan
    Zhou, Norm
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 3513 - 3523
  • [22] CIRRUS: a Serverless Framework for End-to-end ML Workflows
    Carreira, Joao
    Fonseca, Pedro
    Tumanov, Alexey
    Zhang, Andrew
    Katz, Randy
    PROCEEDINGS OF THE 2019 TENTH ACM SYMPOSIUM ON CLOUD COMPUTING (SOCC '19), 2019, : 13 - 24
  • [23] End-to-end reproducible AI pipelines in radiology using the cloud
    Bontempi, Dennis
    Nuernberg, Leonard
    Pai, Suraj
    Krishnaswamy, Deepa
    Thiriveedhi, Vamsi
    Hosny, Ahmed
    Mak, Raymond H.
    Farahani, Keyvan
    Kikinis, Ron
    Fedorov, Andrey
    Aerts, Hugo J. W. L.
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [24] End-to-end differentiable learning of turbulence models from indirect observations
    Carlos A. Michelén Ströfer
    Heng Xiao
    Theoretical & Applied Mechanics Letters, 2021, 11 (04) : 205 - 212
  • [25] ∂PV: An end-to-end differentiable solar-cell simulator
    Mann, Sean
    Fadel, Eric
    Schoenholz, Samuel S.
    Cubuk, Ekin D.
    Johnson, Steven G.
    Romano, Giuseppe
    COMPUTER PHYSICS COMMUNICATIONS, 2022, 272
  • [26] End-to-End Semi-supervised Learning for Differentiable Particle Filters
    Wen, Hao
    Chen, Xiongjie
    Papagiannis, Georgios
    Hu, Conghui
    Li, Yunpeng
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 5825 - 5831
  • [27] Toward the end-to-end optimization of particle physics instruments with differentiable programming
    Dorigo T.
    Giammanco A.
    Vischia P.
    Aehle M.
    Bawaj M.
    Boldyrev A.
    de Castro Manzano P.
    Derkach D.
    Donini J.
    Edelen A.
    Fanzago F.
    Gauger N.R.
    Glaser C.
    Baydin A.G.
    Heinrich L.
    Keidel R.
    Kieseler J.
    Krause C.
    Lagrange M.
    Lamparth M.
    Layer L.
    Maier G.
    Nardi F.
    Pettersen H.E.S.
    Ramos A.
    Ratnikov F.
    Röhrich D.
    de Austri R.R.
    del Árbol P.M.R.
    Savchenko O.
    Simpson N.
    Strong G.C.
    Taliercio A.
    Tosi M.
    Ustyuzhanin A.
    Zaraket H.
    Reviews in Physics, 2023, 10
  • [28] Expanding End-to-End Question Answering on Differentiable Knowledge Graphs with Intersection
    Sen, Priyanka
    Saffari, Amir
    Oliya, Armin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8805 - 8812
  • [29] An End-to-End Differentiable Framework for Contact-Aware Robot Design
    Xu, Jie
    Chen, Tao
    Zlokapa, Lara
    Foshey, Michael
    Matusik, Wojciech
    Sueda, Shinjiro
    Agrawal, Pulkit
    ROBOTICS: SCIENCE AND SYSTEMS XVII, 2021
  • [30] End-to-end differentiable learning of turbulence models from indirect observations
    Strofer, Carlos A. Michelen
    Xiao, Heng
    THEORETICAL AND APPLIED MECHANICS LETTERS, 2021, 11 (04)