On the Timed Analysis of Big-Data Applications

Cited by: 5
Authors
Marconi, Francesco [1]
Quattrocchi, Giovanni [1]
Baresi, Luciano [1]
Bersani, Marcello M. [1]
Rossi, Matteo [1]
Affiliations
[1] Politecn Milan, DEIB, Milan, Italy
Source
NASA FORMAL METHODS, NFM 2018 | 2018 / Vol. 10811
Funding
EU Horizon 2020;
Keywords
Big-Data Applications; Metric temporal logic; Formal verification; Apache Spark;
DOI
10.1007/978-3-319-77935-5_22
Chinese Library Classification number
V [Aeronautics, Astronautics];
Discipline classification code
08; 0825;
Abstract
Apache Spark is one of the best-known frameworks for executing big-data batch applications over a cluster of (virtual) machines. Defining the cluster (i.e., the number of machines and CPUs) so that the application meets guarantees on its execution time (deadlines) involves a trade-off between the cost of the infrastructure and the time needed to execute the application. Sizing the computational resources, in order to prevent cost overruns, can benefit from formal models that capture the execution time of applications. Our model of Spark applications, based on the CLTLoc logic, is defined by considering the directed acyclic graph around which Spark programs are organized, the number of available CPUs, the number of tasks processed by the application, and the average execution times of tasks. If the outcome of the analysis is positive, then the execution is feasible; that is, it can be completed within a given time span. The analysis tool has been implemented on top of the Zot formal verification tool. A preliminary evaluation shows that our model is sufficiently accurate: the formal analysis identifies execution times that are close (the error is less than 10%) to those obtained by actually running the applications.
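The inputs named in the abstract (a directed acyclic graph of stages, the number of CPUs, the number of tasks, and average task times) can be illustrated with a rough arithmetic sketch. This is not the CLTLoc model or the Zot-based analysis described by the authors; it is a simplified estimate that assumes stages run one at a time in topological order and that each stage runs its tasks in waves of at most `cpus` tasks. All names (`estimate_makespan`, `is_feasible`, `relative_error`) and all numbers in the example are hypothetical.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def estimate_makespan(stages, deps, cpus):
    """Rough execution-time estimate for a DAG of stages.

    stages: stage name -> (number of tasks, average task time)
    deps:   stage name -> set of stages that must finish first
    cpus:   number of CPUs available to run tasks in parallel

    Assumption (not from the paper): stages never overlap, and a stage
    with n tasks takes ceil(n / cpus) waves of one average task time each.
    """
    graph = {name: deps.get(name, set()) for name in stages}
    total = 0.0
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(graph).static_order():
        num_tasks, avg_time = stages[name]
        waves = math.ceil(num_tasks / cpus)
        total += waves * avg_time
    return total


def is_feasible(stages, deps, cpus, deadline):
    """True if the estimated execution time fits within the deadline."""
    return estimate_makespan(stages, deps, cpus) <= deadline


def relative_error(estimated, measured):
    """Relative error of an estimate against a measured execution time."""
    return abs(estimated - measured) / measured


# Hypothetical three-stage application (illustrative numbers only).
example_stages = {"load": (8, 2.0), "map": (16, 1.5), "reduce": (4, 3.0)}
example_deps = {"map": {"load"}, "reduce": {"map"}}

if __name__ == "__main__":
    for cpus in (4, 8):
        estimate = estimate_makespan(example_stages, example_deps, cpus)
        verdict = is_feasible(example_stages, example_deps, cpus, deadline=12.0)
        print(cpus, estimate, verdict)
```

With these made-up numbers, doubling the CPUs from 4 to 8 turns an infeasible configuration into a feasible one for a deadline of 12 time units, which is the cost-versus-time trade-off the abstract describes. A formal model such as the authors' can capture behaviour that this arithmetic ignores, for example stages that overlap in time.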
Pages: 315-332
Number of pages: 18