Prov-Dominoes: An approach for knowledge discovery from provenance data

Cited by: 0
Authors
Alencar, Victor [1]
Kohwalter, Troy [2]
Braganholo, Vanessa [2]
Da Silva Junior, Jose Ricardo [3, 4]
Murta, Leonardo [2]
Affiliations
[1] CASNAV, Brazilian Navy, Rio De Janeiro, RJ, Brazil
[2] Univ Fed Fluminense, Inst Computacao, Niteroi, RJ, Brazil
[3] IFRJ, Dept Computacao, Niteroi, RJ, Brazil
[4] Inst Fed Rio Janeiro, Niteroi, RJ, Brazil
Keywords
Knowledge discovery; Data analysis; Provenance; GPU computing; Visualization; Model
DOI
10.1016/j.eswa.2023.123030
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Provenance has become increasingly relevant to understanding, auditing, and reproducing computational tasks. Provenance analysis can often overwhelm users because of the large volume of data, the multiple relationships among data, and the implicit information buried in the data. Existing provenance analysis tools either rely on visual exploration (which is overwhelming for large provenance graphs) or do not support the exploration of implicit provenance data, such as the inferences of the PROV Data Model Constraints. To fill this gap, we introduce Prov-Dominoes, a tool designed to enable interactive knowledge discovery on provenance data. Prov-Dominoes promotes the provenance relationships among entities, activities, and agents into first-class elements represented by domino tiles. It allows users to combine and compose such domino tiles visually and interactively, with processing performed on the GPU. The benefits of Prov-Dominoes are threefold: first, it uses matrices to display provenance data, which is more compact than graphs; second, it allows users to easily explore implicit information; third, it can efficiently process large datasets using GPUs. We evaluated Prov-Dominoes in distinct case studies that show the tool in action. We also evaluated the performance of sequential combinations executed in Prov-Dominoes on provenance data with thousands of relations, comparing their execution on GPU and CPU. The results showed that, for a large dataset, the GPU was more than a hundred times faster than the CPU.
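The idea of treating provenance relations as matrices and combining them can be illustrated with a minimal sketch. The abstract does not specify the combination operation, so this sketch assumes it is a plain matrix product; the entity and activity names, the `combine` helper, and the toy data are all hypothetical and are not taken from Prov-Dominoes itself. It runs on the CPU in pure Python, not on a GPU.

```python
# Hypothetical toy provenance; every name below is illustrative only.
entities = ["report", "chart", "raw_data"]
activities = ["plot", "write"]

# was_generated_by[e][a] == 1 when entity e was generated by activity a.
was_generated_by = [
    [0, 1],  # report was generated by write
    [1, 0],  # chart was generated by plot
    [0, 0],  # raw_data is an input, generated by nothing here
]

# used[a][e] == 1 when activity a used entity e.
used = [
    [0, 0, 1],  # plot used raw_data
    [0, 1, 0],  # write used chart
]


def combine(left, right):
    """Assumed tile combination: the matrix product of two relation matrices."""
    if len(left[0]) != len(right):
        raise ValueError("inner dimensions differ")
    return [
        [sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
        for row in left
    ]


# Entity-by-entity matrix: which entity was produced by an activity
# that used which other entity. This relation is not stated explicitly.
depends_on = combine(was_generated_by, used)

# Combining the result with itself exposes two-step dependencies.
two_step = combine(depends_on, depends_on)
```

Under these assumptions, `depends_on` links `report` to `chart` and `chart` to `raw_data`, and `two_step` links `report` to `raw_data`, a relation that appears in neither input matrix.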
Pages: 17