On the Timed Analysis of Big-Data Applications

被引:5
|
作者
Marconi, Francesco [1 ]
Quattrocchi, Giovanni [1 ]
Baresi, Luciano [1 ]
Bersani, Marcello M. [1 ]
Rossi, Matteo [1 ]
机构
[1] Politecn Milan, DEIB, Milan, Italy
来源
NASA FORMAL METHODS, NFM 2018 | 2018年 / 10811卷
基金
欧盟地平线“2020”;
关键词
Big-Data Applications; Metric temporal logic; Formal verification; Apache Spark;
D O I
10.1007/978-3-319-77935-5_22
中图分类号
V [航空、航天];
学科分类号
08 ; 0825 ;
摘要
Apache Spark is one of the best-known frameworks for executing big-data batch applications over a cluster of (virtual) machines. Defining the cluster (i.e., the number of machines and CPUs) to attain guarantees on the execution times (deadlines) of the application is indeed a trade-off between the cost of the infrastructure and the time needed to execute the application. Sizing the computational resources, in order to prevent cost overruns, can benefit from the use of formal models as a means to capture the execution time of applications. Our model of Spark applications, based on the CLTLoc logic, is defined by considering the directed acyclic graph around which Spark programs are organized, the number of available CPUs, the number of tasks elaborated by the application, and the average execution times of tasks. If the outcome of the analysis is positive, then the execution is feasible-that is, it can be completed within a given time span. The analysis tool has been implemented on top of the Zot formal verification tool. A preliminary evaluation shows that our model is sufficiently accurate: the formal analysis identifies execution times that are close (the error is less than 10%) to those obtained by actually running the applications.
引用
收藏
页码:315 / 332
页数:18
相关论文
共 50 条
  • [1] Big-Data Applications in the Government Sector
    Kim, Gang-Hoon
    Trimi, Silvana
    Chung, Ji-Hyong
    COMMUNICATIONS OF THE ACM, 2014, 57 (03) : 78 - 85
  • [2] Profiling Memory Vulnerability of Big-data Applications
    Rameshan, N.
    Birke, R.
    Navarro, L.
    Vlassov, V.
    Urgaonkar, B.
    Kesidis, G.
    Schmatz, M.
    Chen, L. Y.
    2016 46TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOPS (DSN-W), 2016, : 258 - 261
  • [3] A Data Reconstruction Method for The Big-Data Analysis
    Mito, Masataka
    Murata, Kenya
    Eguchi, Daisuke
    Mori, Yuichiro
    Toyonaga, Masahiko
    2018 9TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST), 2018, : 319 - 323
  • [4] From photons to big-data applications: terminating terabits
    Zilberman, Noa
    Moore, Andrew W.
    Crowcroft, Jon A.
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2016, 374 (2062):
  • [5] Advances in modelling and simulation for big-data applications (AMSBA)
    Pop, Florin
    Iacono, Mauro
    Gribaudo, Marco
    Kolodziej, Joanna
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (02): : 291 - 293
  • [6] Interpreting big-data analysis of retrospective observational data
    Huizinga, Tom W. J.
    Knevel, Rachel
    LANCET RHEUMATOLOGY, 2020, 2 (11): : E652 - E653
  • [7] Analysis of Big-Data Based Data Mining Engine
    Huang, Xinxin
    Gong, Shu
    2017 13TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2017, : 164 - 168
  • [8] Failure Analysis and Prediction for Big-Data Systems
    Rosa, Andrea
    Chen, Lydia Y.
    Binder, Walter
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2017, 10 (06) : 984 - 998
  • [9] Big-Data Visualization
    Keim, Daniel
    Qu, Huamin
    Ma, Kwan-Liu
    IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2013, 33 (04) : 20 - 21
  • [10] Optimizing Read-Once Data Flow in Big-Data Applications
    Morad, Tomer Y.
    Shomron, Gil
    Erez, Mattan
    Kolodny, Avinoam
    Weiser, Uri C.
    IEEE COMPUTER ARCHITECTURE LETTERS, 2017, 16 (01) : 68 - 71