Prov-Dominoes: An approach for knowledge discovery from provenance data

Cited by: 0
Authors
Alencar, Victor [1]
Kohwalter, Troy [2]
Braganholo, Vanessa [2]
Da Silva Junior, Jose Ricardo [3, 4]
Murta, Leonardo [2]
Affiliations
[1] CASNAV, Brazilian Navy, Rio de Janeiro, RJ, Brazil
[2] Univ Fed Fluminense, Inst Computacao, Niteroi, RJ, Brazil
[3] IFRJ, Dept Computacao, Niteroi, RJ, Brazil
[4] Inst Fed Rio Janeiro, Niteroi, RJ, Brazil
Keywords
Knowledge discovery; Data analysis; Provenance; GPU computing; Visualization; Model
DOI
10.1016/j.eswa.2023.123030
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Provenance has become increasingly relevant to understanding, auditing, and reproducing computational tasks. The provenance analysis processes can often be overwhelming to the user due to the large volume of data, the multiple relationships among data, and the implicit information buried in the data. Existing provenance analysis tools use either visual exploration (which is overwhelming for large provenance graphs) or do not support the exploration of implicit provenance data, such as the inferences of the PROV Data Model Constraints. To fill in this gap, we introduce Prov-Dominoes, a tool designed to interactively enable knowledge discovery on provenance data. Prov-Dominoes promotes the provenance relationships among entities, activities, and agents into first-class elements represented by domino tiles. It allows users to combine and compose such domino tiles visually and interactively, using GPU. The benefits of Prov-Dominoes are three-fold: first, it uses matrices to display provenance data, which is more compact than graphs; second, it allows users to easily explore implicit information; third, it is capable of efficiently processing large datasets using GPUs. We evaluated Prov-Dominoes over distinct case studies, allowing the observation of Prov-Dominoes in action. We also evaluated the performance of sequential combinations executed in Prov-Dominoes when dealing with provenance data with thousands of relations, contrasting their executions in GPU and CPU. The results showed that, for a large dataset, GPU was more than a hundred times faster than CPU.
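The idea of composing relationship tiles held as matrices can be illustrated with a minimal sketch. It assumes that combining two tiles corresponds to multiplying their relation matrices, which the abstract does not state explicitly. The toy data, the variable names, and the `compose` helper are hypothetical and are not taken from Prov-Dominoes. The sketch runs on the CPU in plain Python and does not reproduce the tool's GPU execution.

```python
# Hypothetical toy provenance data (illustrative only, not from the paper).
entities = ["report.pdf", "chart.png"]
activities = ["compile", "plot"]
agents = ["alice", "bob"]

# was_generated_by[e][a] == 1 when entity e was generated by activity a.
was_generated_by = [
    [1, 0],  # report.pdf was generated by compile
    [0, 1],  # chart.png was generated by plot
]

# was_associated_with[a][g] == 1 when activity a was associated with agent g.
was_associated_with = [
    [1, 0],  # compile was associated with alice
    [1, 1],  # plot was associated with alice and bob
]


def compose(left, right):
    """Multiply two relation matrices (assumed model of combining two tiles)."""
    inner = len(right)
    cols = len(right[0])
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for row in left
    ]


# Derived entity-by-agent relation: which agents are linked to which entities
# through some activity. This information is only implicit in the two inputs.
entity_agent = compose(was_generated_by, was_associated_with)

for name, row in zip(entities, entity_agent):
    linked = [agent for agent, count in zip(agents, row) if count > 0]
    print(name, "->", linked)
```

Each nonzero cell of the derived matrix counts the activities connecting an entity to an agent, so one multiplication exposes a relationship that neither input matrix records directly.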
Pages: 17
Related papers
50 in total
  • [21] PROV-IDEA: Supporting Interoperable Schema and Data Provenance within Database Evolution
    Perez, Beatriz
    Garcia, Angel Luis Rubio
    Zapata, Maria A.
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2025, 34 (03)
  • [22] Prov Viewer: A Graph-Based Visualization Tool for Interactive Exploration of Provenance Data
    Kohwalter, Troy
    Oliveira, Thiago
    Freire, Juliana
    Clua, Esteban
    Murta, Leonardo
    PROVENANCE AND ANNOTATION OF DATA AND PROCESSES, IPAW 2016, 2016, 9672 : 71 - 82
  • [23] Prov-Trust: Towards a Trustworthy SGX-based Data Provenance System
    Kaaniche, Nesrine
    Belguith, Sana
    Laurent, Maryline
    Gehani, Ashish
    Russello, Giovanni
    PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON E-BUSINESS AND TELECOMMUNICATIONS (SECRYPT), VOL 1, 2020, : 225 - 237
  • [24] A hybrid and exploratory approach to knowledge discovery in metabolomic data
    Grissa, Dhouha
    Comte, Blandine
    Petera, Melanie
    Pujos-Guillot, Estelle
    Napoli, Amedeo
    DISCRETE APPLIED MATHEMATICS, 2020, 273 (273) : 103 - 116
  • [25] Knowledge Discovery in the Social Sciences: A Data Mining Approach
    Nelson, Laura K.
    CONTEMPORARY SOCIOLOGY-A JOURNAL OF REVIEWS, 2021, 50 (04) : 346 - 348
  • [26] MembershipMap: A data transformation approach for knowledge discovery in databases
    Frigui, H
    2004 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, PROCEEDINGS, 2004, : 1147 - 1152
  • [27] MLflow2PROV: Extracting Provenance from Machine Learning Experiments
    Schlegel, Marius
    Sattler, Kai-Uwe
    PROCEEDINGS OF THE SEVENTH WORKSHOP ON DATA MANAGEMENT FOR END-TO-END MACHINE LEARNING, DEEM, 2023,
  • [28] Provenance Comparison for Large-Scale Knowledge Discovery
    Zhao, Xiang
    Ge, Bin
    Tang, Jiuyang
    Xiao, Weidong
    Shang, Haichuan
    2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
  • [29] Knowledge discovery from diagrammatically represented data
    Anderson, M
    2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2001, : 19 - 26
  • [30] Knowledge discovery from data streams Introduction
    Gama, Joao
    Ganguly, Auroop
    Omitaomu, Olufemi
    Vatsavai, Raju
    Gaber, Mohamed
    INTELLIGENT DATA ANALYSIS, 2009, 13 (03) : 403 - 404