Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres

Cited by: 17
Authors
Gill, Sukhpal Singh [1]
Ouyang, Xue [2]
Garraghan, Peter [3]
Affiliations
[1] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London, England
[2] Natl Univ Def Technol, Sch Elect Sci, Changsha, Peoples R China
[3] Univ Lancaster, Sch Comp & Commun, Lancaster, England
Source
JOURNAL OF SUPERCOMPUTING | 2020 / Vol. 76 / Issue 12
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Computing; Stragglers; Cloud computing; Straggler management; Distributed systems; Cloud data centres;
DOI
10.1007/s11227-020-03241-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud computing systems are splitting compute- and data-intensive jobs into smaller tasks to execute them in a parallel manner using clusters to improve execution time. However, such systems at increasing scale are exposed to stragglers, whereby abnormally slow running tasks executing within a job substantially affect job performance completion. Such stragglers are a direct threat towards attaining fast execution of data-intensive jobs within cloud computing. Researchers have proposed an assortment of different mechanisms, frameworks, and management techniques to detect and mitigate stragglers both proactively and reactively. In this paper, we present a comprehensive review of straggler management techniques within large-scale cloud data centres. We provide a detailed taxonomy of straggler causes, as well as proposed management and mitigation techniques based on straggler characteristics and properties. From this systematic review, we outline several outstanding challenges and potential directions of possible future work for straggler research.
Pages: 10050-10089
Page count: 40