s2p: Provenance Research for Stream Processing System

Cited by: 3
Authors
Ye, Qian [1 ,2 ]
Lu, Minyan [1 ,2 ]
Affiliations
[1] Beihang Univ, Key Lab Reliabil & Environm Engn Technol, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Reliabil & Syst Engn, Beijing 100191, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 12
Keywords
stream provenance; fine-grained provenance; coarse-grained provenance; replay; checkpoint; MAPREDUCE; MODEL;
DOI
10.3390/app11125523
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
The main purpose of our provenance research for DSP (distributed stream processing) systems is to analyze abnormal results. Provenance for these systems is nontrivial because of the ephemerality of stream data and the instant data processing mode of modern DSP systems. Challenges include, but are not limited to, avoiding excessive runtime overhead, reducing provenance-related data storage, and providing provenance in an easy-to-use fashion. Without any prior knowledge about which kinds of data may ultimately lead to abnormal results, we have to track all transformations in detail, which potentially imposes a heavy burden on the system. This paper proposes s2p (Stream Process Provenance), which mainly consists of online provenance and offline provenance, to provide fine- and coarse-grained provenance at different levels of precision. We base our design of s2p on the fact that, for a mature online DSP system, abnormal results are rare, and results that require a detailed analysis are even rarer. We also consider state transition in our provenance explanation. We implement s2p on Apache Flink as s2p-flink and conduct three experiments to evaluate its scalability, efficiency, and overhead in terms of end-to-end cost, throughput, and space overhead. Our evaluation shows that s2p-flink incurs a 13% to 32% cost overhead, an 11% to 24% decline in throughput, and little additional space cost in the online provenance phase. Experiments also demonstrate that s2p-flink scales well. A case study is presented to demonstrate the feasibility of the whole s2p solution.
Pages: 33