Optimized Fault Tolerance as Services Provisioning for Cloud Applications

Authors
Yang N. [1]
Liu J. [1]
Affiliation
[1] College of Computer Science, Inner Mongolia University, Hohhot
Source
Ruan Jian Xue Bao/Journal of Software | 2019 / Vol. 30 / No. 04
Funding
National Natural Science Foundation of China
Keywords
Checkpoint fault tolerance; Cloud computing; Fault tolerance as a service; Optimization; Replication fault tolerance
DOI
10.13328/j.cnki.jos.005372
Abstract
Providing efficient, continuously available fault tolerance services is essential to the reliable execution of cloud applications. This study adopts the fault-tolerance-as-a-service scheme and proposes an optimized method for provisioning fault tolerance services. The fault tolerance requirements of a cloud application are specified per cloud service component, in terms of attributes such as reliability and response time. Building on the major fault tolerance technologies, namely replication, checkpointing, and NVP (N-Version Programming), and accounting for the overhead of dynamically switching among fault tolerance services, the method computes an optimal solution among the feasible fault tolerance service provisioning options. Two scenarios are analyzed: one in which the cloud infrastructure resources supporting the fault tolerance services are sufficient, and one in which they are not. Experimental results show that the proposed method reduces the fault tolerance service expenses of cloud application systems, reduces the cost of the cloud infrastructure resources that support the fault tolerance services, and improves the capacity of fault tolerance service providers to deliver efficient and reliable fault tolerance as a service to cloud application systems. © Copyright 2019, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
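The abstract gives no formulas, so the following is only an illustrative sketch of the kind of selection problem it describes, not the authors' method. It picks, for each component, a fault tolerance service that meets the component's reliability and response time requirements, charges a flat switching overhead when the chosen service differs from the one already deployed, and optionally enforces a resource capacity to mimic the "insufficient resources" scenario. Every name and number here (FTService, Component, cheapest_plan, the cost and resource figures, the flat switch_cost) is a hypothetical assumption, and the exhaustive search is a stand-in for whatever optimization the paper actually uses.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class FTService:
    name: str
    reliability: float    # reliability delivered to the protected component
    response_time: float  # response time under this service (ms)
    cost: float           # provisioning cost per component (assumed unit)
    resources: int        # infrastructure units consumed (assumed unit)


@dataclass(frozen=True)
class Component:
    name: str
    min_reliability: float
    max_response_time: float
    current: Optional[str] = None  # service already deployed, if any


def feasible(c: Component, s: FTService) -> bool:
    """A service is feasible if it meets both requirements of the component."""
    return (s.reliability >= c.min_reliability
            and s.response_time <= c.max_response_time)


def plan_cost(components, plan, switch_cost: float) -> float:
    """Service cost plus a flat overhead for every component that switches."""
    total = 0.0
    for c, s in zip(components, plan):
        total += s.cost
        if c.current is not None and c.current != s.name:
            total += switch_cost
    return total


def cheapest_plan(components, services, switch_cost, capacity=None):
    """Exhaustively search feasible assignments; return (cost, plan) or None.

    capacity=None models sufficient infrastructure resources; an integer
    capacity models the constrained case.
    """
    options = [[s for s in services if feasible(c, s)] for c in components]
    best = None
    for plan in product(*options):
        if capacity is not None and sum(s.resources for s in plan) > capacity:
            continue
        cost = plan_cost(components, plan, switch_cost)
        if best is None or cost < best[0]:
            best = (cost, plan)
    if best is None:
        return None
    return best[0], {c.name: s.name for c, s in zip(components, best[1])}


# Hypothetical catalogue and requirements, for illustration only.
SERVICES = [
    FTService("checkpoint", 0.990, 120.0, 2.0, 1),
    FTService("replication", 0.999, 60.0, 5.0, 3),
    FTService("nvp", 0.9995, 90.0, 8.0, 4),
]
COMPONENTS = [
    Component("web", 0.995, 100.0, current="nvp"),
    Component("batch", 0.98, 200.0),
]
```

Calling cheapest_plan(COMPONENTS, SERVICES, switch_cost=1.0) searches without a resource limit, while passing capacity=3 shows how a tight resource budget can leave no feasible plan. Raising switch_cost makes it cheaper to keep the service that "web" already uses, which is the trade-off the abstract attributes to dynamic switching overhead.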
Pages: 1191-1202
Page count: 11
Related papers
21 in total
  • [1] Jhawar R., Piuri V., Fault tolerance and resilience in cloud computing environments, Computer and Information Security Handbook, pp. 125-141, (2013)
  • [2] Dai H.J., Zhao S.L., Zhang J.T., Qiu M.K., Tao L.X., Security enhancement of cloud servers with a redundancy-based fault-tolerant cache structure, Future Generation Computer Systems, 52, pp. 147-155, (2015)
  • [3] Wang J., Bao W.D., Zhu X.M., Yang L.T., Xiang Y., FESTAL: Fault-tolerant elastic scheduling algorithm for real-time tasks in virtualized clouds, IEEE Trans. on Computers, 64, 9, pp. 2545-2558, (2015)
  • [4] Jhawar R., Piuri V., Santambrogio M., Fault tolerance management in cloud computing: A system-level perspective, IEEE Systems Journal, 7, 2, pp. 288-297, (2013)
  • [5] Cheraghlou M.N., Khadem-Zadeh A., Haghparast M., A survey of fault tolerance architecture in cloud computing, Journal of Network and Computer Applications, 61, pp. 81-92, (2016)
  • [6] Sun D.W., Chang G.R., Miao C.S., Wang X.W., Analyzing, modeling and evaluating dynamic adaptive fault tolerance strategies in cloud computing environment, Journal of Supercomputing, 66, 1, pp. 193-228, (2013)
  • [7] Yi H.Z., Wang F., Zuo K., Yang C.Q., Du Y.F., Ma Y.Q., Asynchronous checkpoint/restart based on memory buffer, Journal of Computer Research and Development, 51, 6, pp. 1229-1239, (2014)
  • [8] Gao Y., Gupta S.K., Wang Y.Z., Pedram M., An energy-aware fault tolerance scheduling framework for soft error resilient cloud computing systems, Proc. of the Design, Automation and Test in Europe Conference and Exhibition (DATE 2014), pp. 1-6, (2014)
  • [9] Hamid B., Radermacher A., Vanuxeem P., Lanusse A., Gerard S., A fault-tolerance framework for distributed component systems, Proc. of the 34th Euromicro Conf. Software Engineering and Advanced Applications (SEAA 2008), pp. 84-91, (2008)
  • [10] Nandi B.B., Paul H.S., Banerjee A., Ghosh S.C., Fault tolerance as a service, Proc. of the 6th IEEE Int'l Conf. on Cloud Computing (CLOUD 2013), pp. 446-453, (2013)