Characterizing and modeling cloud applications/jobs on a Google data center

Cited by: 35
Authors
Di, Sheng [1]
Kondo, Derrick [1]
Cappello, Franck [2]
Affiliations
[1] INRIA, Paris, France
[2] Argonne Natl Lab, Lemont, IL USA
Source
JOURNAL OF SUPERCOMPUTING | 2014 / Vol. 69 / Issue 01
Keywords
Google data center; Cloud task; Characterization and analysis; Large-scale system trace; COMPUTING ENVIRONMENTS
DOI
10.1007/s11227-014-1131-z
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we characterize and model Google applications and jobs, based on a 1-month Google trace from a large-scale Google data center. We address four contributions: (1) we compute the valuable statistics about task events and resource utilization for Google applications, based on various types of resources and execution types; (2) we analyze the classification of applications via a K-means clustering algorithm with optimized number of sets, based on task events and resource usage; (3) we study the correlation of Google application properties and running features (e.g., job priority and scheduling class); (4) we finally build a model that can simulate Google jobs/tasks and dynamic events, in accordance with Google trace. Experiments show that the tasks simulated based on our model exhibit fairly analogous features with those in Google trace. 95+ % of tasks' simulation errors are < 20 %, confirming a high accuracy of our simulation model.
Pages: 139 - 160
Page count: 22