Reliability-Aware Distributed Computing Scheduling Policy

Citations: 0
Authors
Abawajy, Jemal [1]
Hassan, Mohammad Mehedi [2]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Fac Sci & Technol, Geelong, Vic 3217, Australia
[2] King Saud Univ, Coll Comp & Informat Sci, Dept Informat Syst, Riyadh 11543, Saudi Arabia
Keywords
Cloud computing; Job scheduling; Fault-tolerance; Replication; Performances;
DOI
10.1007/978-3-319-27161-3_57
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
One of the primary issues associated with the efficient and effective utilization of distributed computing is resource management and scheduling. As resource failure is a common occurrence in distributed computing, deploying support for integrated scheduling and fault-tolerant approaches becomes of paramount importance. To this end, we propose a fault-tolerant dynamic scheduling policy that loosely couples dynamic job scheduling with a job replication scheme so that jobs are executed efficiently and reliably. The novelty of the proposed algorithm is that it uses a passive replication approach under high system load and an active replication approach under low system load. The switch between these two replication methods is done dynamically and transparently. A performance evaluation of the proposed fault-tolerant scheduler and a comparison with a similar fault-tolerant scheduling policy are presented, showing that the proposed policy performs better than the existing approach.
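The load-dependent switch described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the load measure (fraction of busy nodes), the `high_load_threshold` value of 0.7, the replica count, and all names such as `choose_mode` and `place_job` are hypothetical assumptions made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ReplicationMode(Enum):
    ACTIVE = "active"    # all replicas of a job run concurrently
    PASSIVE = "passive"  # one primary runs; backups start only on failure


@dataclass
class Placement:
    job_id: str
    mode: ReplicationMode
    running_on: list[str]  # nodes that execute the job immediately
    standby_on: list[str]  # nodes designated as backups (passive mode only)


def choose_mode(busy_nodes: int, total_nodes: int,
                high_load_threshold: float = 0.7) -> ReplicationMode:
    """Pick a replication mode from the current load.

    Assumption: load is the fraction of busy nodes, and 0.7 is an
    arbitrary illustrative threshold, not a value from the paper.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    load = busy_nodes / total_nodes
    if load >= high_load_threshold:
        return ReplicationMode.PASSIVE  # high load: avoid redundant execution
    return ReplicationMode.ACTIVE       # low load: spare capacity runs replicas


def place_job(job_id: str, free_nodes: list[str], busy_nodes: int,
              total_nodes: int, replicas: int = 2) -> Placement:
    """Assign a job to nodes according to the mode chosen for the current load."""
    if not free_nodes:
        raise ValueError("no free node available")
    mode = choose_mode(busy_nodes, total_nodes)
    chosen = free_nodes[:replicas]
    if mode is ReplicationMode.ACTIVE:
        return Placement(job_id, mode, running_on=chosen, standby_on=[])
    # Passive: only the first node executes; the rest are named as backups.
    return Placement(job_id, mode, running_on=chosen[:1], standby_on=chosen[1:])


if __name__ == "__main__":
    print(place_job("job-1", ["n1", "n2", "n3"], busy_nodes=2, total_nodes=10))
    print(place_job("job-2", ["n1", "n2", "n3"], busy_nodes=8, total_nodes=10))
```

Because the mode is re-evaluated on every placement, the switch between the two schemes happens per job without any caller involvement, which is one simple way to read "dynamically and transparently".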
Pages: 627 - 632
Page count: 6
Related papers
50 in total
  • [31] RELIABILITY-AWARE MICROARCHITECTURE DESIGN
    Reddi, Vijay Janapa
    IEEE MICRO, 2013, 33 (04) : 4 - 5
  • [32] Towards Intelligent Edge Computing: A Resource- and Reliability-Aware Hybrid Scheduling Method on Multi-FPGA Systems
    Li, Zeyu
    Hao, Yuchen
    Gao, Hongxu
    Zhou, Jia
    ELECTRONICS, 2025, 14 (01):
  • [33] Reliability-aware Speed Control Policy for Energy Reduction in Server Farms
    Tian, Yuan
    Lin, Chuang
    Huang, Jiwei
    Yao, Min
    2012 IEEE INTERNATIONAL CONFERENCE ON GREEN COMPUTING AND COMMUNICATIONS, CONFERENCE ON INTERNET OF THINGS, AND CONFERENCE ON CYBER, PHYSICAL AND SOCIAL COMPUTING (GREENCOM 2012), 2012 : 508 - 514
  • [34] Reliability-aware low energy scheduling in real time systems with shared resources
    Zhang, Yi-wen
    Zhang, Hui-zhen
    Wang, Cheng
    MICROPROCESSORS AND MICROSYSTEMS, 2017, 52 : 312 - 324
  • [35] Energy- and reliability-aware task scheduling onto heterogeneous MPSoC architectures
    Suleyman Tosun
    The Journal of Supercomputing, 2012, 62 : 265 - 289
  • [36] Energy- and reliability-aware task scheduling onto heterogeneous MPSoC architectures
    Tosun, Suleyman
    JOURNAL OF SUPERCOMPUTING, 2012, 62 (01) : 265 - 289
  • [37] Reliability-Aware Workflow Scheduling Using Monte Carlo Failure Estimation in Cloud
    Rehani, Nidhi
    Garg, Ritu
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMMUNICATION AND NETWORKS, 2017, 508 : 139 - 153
  • [38] Soft and Hard Reliability-Aware Scheduling for Multicore Embedded Systems with Energy Harvesting
    Xiang, Yi
    Pasricha, Sudeep
    IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS, 2015, 1 (04) : 220 - 235
  • [39] Latency Improvement Strategies for Reliability-Aware Scheduling in Industrial Wireless Sensor Networks
    Dobslaw, Felix
    Zhang, Tingting
    Gidlund, Mikael
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2015
  • [40] Providing Reliability-Aware Virtualized Network Function Services for Mobile Edge Computing
    Li, Jing
    Liang, Weifa
    Huang, Meitian
    Jia, Xiaohua
    2019 39TH IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2019), 2019 : 732 - 741