A Hybrid Approach to High Availability in Stream Processing Systems

Cited by: 36
Authors
Zhang, Zhe [1]
Gu, Yu [2]
Ye, Fan [3]
Yang, Hao [4]
Kim, Minkyong [3]
Lei, Hui [3]
Liu, Zhen [4]
Affiliations
[1] Oak Ridge Natl Lab, Natl Ctr Computat Sci, Oak Ridge, TN 37831 USA
[2] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[3] IBM T J Watson Res Ctr, Hawthorne, NY USA
[4] Nokia Res Ctr, White Plains, NY USA
Keywords
ALGORITHMS;
DOI
10.1109/ICDCS.2010.81
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Stream processing is widely used by today's applications such as financial data analysis and disaster response. In distributed stream processing systems, machine fail-stop events are handled by either active standby or passive standby. However, existing high availability (HA) schemes have not sufficiently addressed the situation in which a machine becomes temporarily unavailable due to data rate spikes, intensive analysis, or job sharing, which happens frequently but lasts only a short time. It is not clear how well active and passive standby fare against such transient unavailability. In this paper, we first critically examine the suitability of active and passive standby against transient unavailability in a real testbed environment. We find that both approaches have advantages and drawbacks, but neither is ideal for providing the fast recovery at low overhead that handling transient unavailability requires. Based on the insights gained, we propose a novel hybrid HA method that switches between active and passive standby modes depending on the occurrence of failure events. It presents a desirable tradeoff that differs from existing HA approaches: low overhead during normal conditions and fast recovery upon transient or permanent failure events. We have implemented our hybrid method and compared it with existing HA designs in a comprehensive evaluation. The results show that our hybrid method can reduce recovery time by two-thirds compared to passive standby and message overhead by 80% compared to active standby, allowing applications to enjoy uninterrupted processing without paying a high premium.
Pages: 11
Related papers
50 in total
  • [1] Recovery Processing for High Availability Stream Processing Systems in Local Area Networks
    Aritsugi, Masayoshi
    Nagano, Kyoko
    TENCON 2010: 2010 IEEE REGION 10 CONFERENCE, 2010, : 1036 - 1041
  • [2] Exploitation of Backup Nodes for Reducing Recovery Cost in High Availability Stream Processing Systems
    Nagano, Kyoko
    Itokawa, Tsuyoshi
    Kitasuka, Teruaki
    Aritsugi, Masayoshi
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL DATABASE ENGINEERING & APPLICATIONS SYMPOSIUM (IDEAS '10), 2010, : 61 - 63
  • [3] High-availability algorithms for distributed stream processing
    Hwang, JH
    Balazinska, M
    Rasin, A
    Çetintemel, U
    Stonebraker, M
    Zdonik, S
    ICDE 2005: 21ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2005, : 779 - 790
  • [4] Distributed stream processing analysis in high availability context
    Gorawski, Marcin
    Marks, Pawel
    ARES 2007: SECOND INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY, PROCEEDINGS, 2007, : 61 - +
  • [5] Load management and high availability in the Borealis distributed stream processing engine
    Tatbul, Nesime
    Ahmad, Yanif
    Cetintemel, Ugur
    Hwang, Jeong-Hyon
    Xing, Ying
    Zdonik, Stan
    GEOSENSOR NETWORKS, 2008, 4540 : 66 - +
  • [6] Stream Processing on Hybrid CPU/Intel® Xeon Phi™ Systems
    Ferrao, Paulo
    Marques, Helder
    Paulino, Herve
    EURO-PAR 2018: PARALLEL PROCESSING, 2018, 11014 : 796 - 810
  • [7] A cooperative, self-configuring high-availability solution for stream processing
    Hwang, Jeong-Hyon
    Xing, Ying
    Cetintemel, Ugur
    Zdonik, Stan
    2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 151 - +
  • [8] Soft Quorums: A High Availability Solution for Service Oriented Stream Systems
    Song, Chunyao
    Ge, Tingjian
    Chen, Cindy
    Wang, Jie
    2015 IEEE 35th International Conference on Distributed Computing Systems, 2015, : 778 - 779
  • [9] Soft Quorums: A High Availability Solution for Service Oriented Stream Systems
    Song, Chunyao
    Ge, Tingjian
    Chen, Cindy
    Wang, Jie
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2017), PT II, 2017, 10178 : 253 - 268
  • [10] A hybrid distributed batch-stream processing approach for anomaly detection
    Pishgoo, Boshra
    Azirani, Ahmad Akbari
    Raahemi, Bijan
    INFORMATION SCIENCES, 2021, 543 : 309 - 327