Prov-Dominoes: An approach for knowledge discovery from provenance data

Authors
Alencar, Victor [1]
Kohwalter, Troy [2]
Braganholo, Vanessa [2]
Da Silva Junior, Jose Ricardo [3, 4]
Murta, Leonardo [2]
Affiliations
[1] CASNAV, Brazilian Navy, Rio De Janeiro, RJ, Brazil
[2] Univ Fed Fluminense, Inst Computacao, Niteroi, RJ, Brazil
[3] IFRJ, Dept Computacao, Niteroi, RJ, Brazil
[4] Inst Fed Rio Janeiro, Niteroi, RJ, Brazil
Keywords
Knowledge discovery; Data analysis; Provenance; GPU computing; Visualization; Model
DOI
10.1016/j.eswa.2023.123030
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Provenance has become increasingly relevant to understanding, auditing, and reproducing computational tasks. However, provenance analysis can overwhelm users because of the large volume of data, the multiple relationships among data items, and the implicit information buried in the data. Existing provenance analysis tools either rely on visual exploration, which becomes overwhelming for large provenance graphs, or do not support the exploration of implicit provenance data, such as the inferences of the PROV Data Model Constraints. To fill this gap, we introduce Prov-Dominoes, a tool designed to enable interactive knowledge discovery on provenance data. Prov-Dominoes promotes the provenance relationships among entities, activities, and agents into first-class elements represented by domino tiles. It allows users to combine and compose such domino tiles visually and interactively, with GPU acceleration. The benefits of Prov-Dominoes are threefold: first, it uses matrices to display provenance data, which are more compact than graphs; second, it allows users to easily explore implicit information; third, it can efficiently process large datasets using GPUs. We evaluated Prov-Dominoes in distinct case studies that show the tool in action. We also evaluated the performance of sequential combinations executed in Prov-Dominoes on provenance data with thousands of relations, comparing GPU and CPU execution. The results showed that, for a large dataset, the GPU was more than a hundred times faster than the CPU.
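
The idea of treating a provenance relation as a matrix "tile" and composing two tiles to surface implicit information can be sketched as follows. This is a minimal illustration only, not the Prov-Dominoes implementation: it assumes that composition behaves like a Boolean matrix product, and the `compose` function, the variable names, and the tiny dataset are all hypothetical. It runs on the CPU in plain Python, whereas the abstract describes the tool as using GPUs for large datasets.

```python
# Hypothetical sketch: names and data are illustrative, not taken from Prov-Dominoes.

def compose(left, right):
    """Boolean matrix product: result[i][j] is 1 if some k links row i to column j."""
    inner = len(right)
    assert all(len(row) == inner for row in left), "inner dimensions must match"
    cols = len(right[0])
    return [
        [int(any(left[i][k] and right[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(len(left))
    ]

activities = ["a1", "a2"]
entities = ["e1", "e2"]

# used[a][e] == 1 means activity a used entity e (activity x entity tile).
used = [
    [0, 0],  # a1 used nothing
    [1, 0],  # a2 used e1
]

# was_generated_by[e][a] == 1 means entity e was generated by activity a
# (entity x activity tile).
was_generated_by = [
    [1, 0],  # e1 was generated by a1
    [0, 1],  # e2 was generated by a2
]

# The shared "entity" side joins the two tiles, like matching domino ends,
# leaving an activity x activity tile that was not stated explicitly.
informed_by = compose(used, was_generated_by)

for i, row in enumerate(informed_by):
    for j, linked in enumerate(row):
        if linked:
            print(f"{activities[i]} depends on output of {activities[j]}")
```

In this toy dataset the only derived link is from `a2` to `a1`, because `a2` used `e1` and `e1` was generated by `a1`. The result is a compact matrix rather than a graph, which is the representation the abstract highlights.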
Pages: 17