A hierarchical reliability-driven scheduling algorithm in grid systems

Cited by: 106
Authors
Tang, Xiaoyong [1]
Li, Kenli [1]
Qiu, Meikang [2]
Sha, Edwin H.-M. [1, 3]
Affiliations
[1] Hunan Univ, Sch Informat Sci & Engn, Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[2] Univ Kentucky, Lexington, KY 40506 USA
[3] Univ Texas Dallas, Dept Comp Sci, Dallas, TX 75230 USA
Funding
National Natural Science Foundation of China; US National Science Foundation
Keywords
Grid computing; Hierarchical; Scheduling algorithm; Reliability; Application; TASK-ALLOCATION ALGORITHMS; INDEPENDENT TASKS; MAXIMIZING RELIABILITY; PERFORMANCE; MODEL;
DOI
10.1016/j.jpdc.2011.12.004
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In a Grid computing system, distributed scientific and engineering applications often require multi-institutional collaboration, large-scale resource sharing, wide-area communication, etc. Applications executing in such systems inevitably encounter different types of failures such as hardware failure, program failure, and storage failure. One way of taking failures into account is to employ a reliable scheduling algorithm. However, most existing Grid scheduling algorithms do not adequately consider the reliability requirements of an application. In recognition of this problem, we design a hierarchical reliability-driven scheduling architecture that includes both a local scheduler and a global scheduler. The local scheduler aims to effectively measure task reliability of an application in a Grid virtual node and incorporate the precedence-constrained tasks' reliability overhead into a heuristic scheduling algorithm. In the global scheduler, we propose a hierarchical reliability-driven scheduling algorithm based on quantitative evaluation of independent application reliability. Our experiments, based on both randomly generated graphs and the graphs of some real applications, show that our hierarchical scheduling algorithm performs much better than the existing scheduling algorithms in terms of system reliability, schedule length, and speedup. © 2011 Elsevier Inc. All rights reserved.
Pages: 525-535
Number of pages: 11
Related papers
50 in total
  • [31] Autotuning control structures for reliability-driven dynamic binding
    Filieri, Antonio
    Ghezzi, Carlo
    Leva, Alberto
    Maggio, Martina
    2012 IEEE 51ST ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2012, : 418 - 423
  • [32] Reliability-driven sensor selection via observability indices
    Mathews, H. Kirk
    Kammer, Leonardo C.
    2007 AMERICAN CONTROL CONFERENCE, VOLS 1-13, 2007, : 3583 - 3588
  • [33] An online scheduling algorithm for Grid computing systems
    Du Kim, H
    Kim, JS
    GRID AND COOPERATIVE COMPUTING, PT 2, 2004, 3033 : 34 - 39
  • [34] An Interconnect Reliability-Driven Routing Technique for Electromigration Failure Avoidance
    Chen, Xiaodao
    Liao, Chen
    Wei, Tongquan
    Hu, Shiyan
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2012, 9 (05) : 770 - 776
  • [35] Reliability-Driven Task Mapping for Lifetime Extension of Networks-on-Chip Based Multiprocessor Systems
    Das, Anup
    Kumar, Akash
    Veeravalli, Bharadwaj
    DESIGN, AUTOMATION & TEST IN EUROPE, 2013, : 689 - 694
  • [36] Survivability and makespan driven scheduling algorithm for grid workflow applications
    Wang, Shu-Peng
    Yun, Xiao-Chun
    Yu, Xiang-Zhan
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2007, 23 (04) : 1299 - 1313
  • [37] Reliability-Driven Neural Network Training for Memristive Crossbar-Based Neuromorphic Computing Systems
    Wang, Junpeng
    Xu, Qi
    Yuan, Bo
    Chen, Song
    Yu, Bei
    Wu, Feng
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [38] Adaptive hierarchical scheduling policy for enterprise grid computing systems
    Abawajy, J. H.
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2009, 32 (03) : 770 - 779
  • [39] Implementation of Hierarchical Scheduling Algorithm on Real-Time Grid Environment
    Nachankar, Abhishek P.
    Dharmik, R. C.
    1ST INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION ICCUBEA 2015, 2015, : 565 - 569
  • [40] Reliability-Driven Deployment in Energy-Harvesting Sensor Networks
    Yu, Xiaofan
    Song, Xueyang
    Cherkasova, Ludmila
    Rosing, Tajana Simunic
    2020 16TH INTERNATIONAL CONFERENCE ON NETWORK AND SERVICE MANAGEMENT (CNSM), 2020,