Automating Common Data Science Matrix Transformations

Cited by: 2
Authors
Contreras-Ochando, Lidia [1]
Ferri, Cesar [1]
Hernandez-Orallo, Jose [1]
Affiliations
[1] Univ Politecn Valencia, Valencian Res Inst Artificial Intelligence VrAIn, Valencia, Spain
Keywords
Data science automation; Matrix transformation; Inductive programming; R programming language
DOI
10.1007/978-3-030-43823-4_2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Programming languages such as R or Python are commonplace in data science projects. However, transforming data is usually tricky, and composing the right primitives (using the appropriate libraries) to obtain the most elegant transformation code is not always easy. In this paper, we present the first system that is able to automatically synthesise program snippets in R given an input data matrix and an output matrix that the user has partially filled in to represent the required transformation. We use the type information given by the dimensions of the matrix primitives (and other constraints) to reduce the combinatorial explosion of primitive compositions. We test the performance of our approach with a set of artificial data and real examples from Stack Overflow questions.
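The idea of using matrix dimensions to prune the search over primitive compositions can be illustrated with a small sketch. This is not the authors' system and does not target R: it is a toy enumerator in plain Python, and the four primitives, the `synthesise` function, and the use of `None` to mark unfilled output cells are all assumptions made for illustration. Each primitive carries a shape function, so a candidate composition can be rejected from its predicted output dimensions before any matrix is actually computed.

```python
from itertools import product

# Matrices are lists of row lists; a shape is a (rows, cols) pair.
# All primitives below are hypothetical stand-ins, not the paper's primitive set.

def transpose(m):
    return [list(col) for col in zip(*m)]

def col_sums(m):
    # Modelled as returning a 1 x cols matrix.
    return [[sum(col) for col in zip(*m)]]

def row_sums(m):
    # Modelled as returning a rows x 1 matrix.
    return [[sum(row)] for row in m]

def reverse_rows(m):
    return m[::-1]

# name -> (shape function, value function)
PRIMITIVES = {
    "transpose": (lambda s: (s[1], s[0]), transpose),
    "col_sums": (lambda s: (1, s[1]), col_sums),
    "row_sums": (lambda s: (s[0], 1), row_sums),
    "reverse_rows": (lambda s: s, reverse_rows),
}

def shape(m):
    return (len(m), len(m[0]))

def matches(result, partial):
    """True if result agrees with every filled (non-None) cell of partial."""
    return shape(result) == shape(partial) and all(
        want is None or want == got
        for got_row, want_row in zip(result, partial)
        for got, want in zip(got_row, want_row)
    )

def synthesise(inp, partial, max_depth=3):
    """Return the first primitive sequence consistent with the partial output."""
    target = shape(partial)
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            # Propagate shapes only; no matrix is computed at this stage.
            s = shape(inp)
            for name in names:
                s = PRIMITIVES[name][0](s)
            if s != target:
                continue  # pruned by dimensions alone
            # Only shape-compatible candidates are executed.
            m = inp
            for name in names:
                m = PRIMITIVES[name][1](m)
            if matches(m, partial):
                return list(names)
    return None

if __name__ == "__main__":
    inp = [[1, 2], [3, 4], [5, 6]]   # 3 x 2 input
    partial = [[9], [None]]          # 2 x 1 output, one cell filled in
    print(synthesise(inp, partial))  # ['transpose', 'row_sums']
```

In this toy setting the shape check discards every single-primitive candidate for the 3 x 2 to 2 x 1 example without executing it, which is the kind of saving the abstract attributes to dimension information; a realistic system would need a much richer primitive set and further constraints.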
Pages: 17-27
Number of pages: 11