Distributed stream processing analysis in high availability context

Authors
Gorawski, Marcin [1]
Marks, Pawel [1]
Affiliations
[1] Silesian Tech Univ, Inst Comp Sci, Akademicka 16, PL-44100 Gliwice, Poland
Keywords
DOI
Not available
CLC number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Not so long ago, data warehouses were used to process data sets loaded periodically during the ETL (Extraction, Transformation and Loading) process. We could distinguish two kinds of ETL processes: full and incremental. Now we often have to process real-time data and analyse it almost on the fly, so that the analyses are always up to date. There are many possible applications for real-time data warehouses. In most cases two features are important: delivering data to the warehouse as quickly as possible, and not losing any tuple in case of failure. In this paper we propose an architecture for gathering and processing data from geographically distributed data sources. We present a theoretical analysis, a mathematical model of a data source, some rules for configuring system modules, and the results of experiments. At the end of the paper our future plans are described briefly.
Pages: 61 / +
Page count: 2
Related papers
50 in total
  • [21] Task Allocation for Distributed Stream Processing
    Eidenbenz, Raphael
    Locher, Thomas
    IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016,
  • [22] Elastic Stream Processing for Distributed Environments
    Hochreiner, Christoph
    Schulte, Stefan
    Dustdar, Schahram
    Lecue, Freddy
    IEEE INTERNET COMPUTING, 2015, 19 (06) : 54 - 59
  • [23] Distributed Data Stream Processing with Onix
    Shtykh, Roman Y.
    Suzuki, Toshihiro
    2014 IEEE FOURTH INTERNATIONAL CONFERENCE ON BIG DATA AND CLOUD COMPUTING (BDCLOUD), 2014, : 267 - 268
  • [24] Load distribution for distributed stream processing
    Xing, Ying
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2004, 3268 : 112 - 120
  • [25] Signal processing challenges in distributed stream processing systems
    Frossard, Pascal
    Verscheure, Olivier
    Venkatramani, Chitra
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 5903 - 5906
  • [26] A Performance Benchmark for NetFlow Data Analysis on Distributed Stream Processing Systems
    Cermak, Milan
    Tovarnak, Daniel
    Lastovicka, Martin
    Celeda, Pavel
    NOMS 2016 - 2016 IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, 2016, : 919 - 924
  • [27] Towards automated analysis of connections network in distributed stream processing system
    Gorawski, Marcin
    Marks, Pawel
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2008, 4947 : 670 - 677
  • [28] Exploitation of Backup Nodes for Reducing Recovery Cost in High Availability Stream Processing Systems
    Nagano, Kyoko
    Itokawa, Tsuyoshi
    Kitasuka, Teruaki
    Aritsugi, Masayoshi
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL DATABASE ENGINEERING & APPLICATIONS SYMPOSIUM (IDEAS '10), 2010, : 61 - 63
  • [29] Modeling Data Stream Intensity in Distributed Stream Processing System
    Gorawski, Marcin
    Marks, Pawel
    Gorawski, Michal
    COMPUTER NETWORKS, CN 2013, 2013, 370 : 372 - 383
  • [30] Efficient and Adaptive Stateful Replication for Stream Processing Engines in High-Availability Cluster
    Feng, Yi-Hsuan
    Huang, Nen-Fu
    Wu, Yen-Min
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2011, 22 (11) : 1788 - 1796