HBelt: Integrating an Incremental ETL Pipeline with a Big Data Store for Real-Time Analytics

Cited by: 1
Authors
Qu, Weiping [1]
Shankar, Sahana [1]
Ganza, Sandy [1]
Dessloch, Stefan [1]
Affiliation
[1] University of Kaiserslautern, Heterogeneous Information Systems Group, D-67663 Kaiserslautern, Germany
Keywords
SYSTEM;
DOI
10.1007/978-3-319-23135-8_9
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This paper demonstrates HBelt, a system that tightly integrates the distributed key-value data store HBase with an extended version of the ETL engine Kettle. The objective is to provide HBase tables with real-time data freshness in an efficient manner. A distributed ETL engine is integrated as an overlay of HBase and extended to process incremental ETL flows in a pipelined fashion. The MVCC component in HBase defines delta batches, which are used to flush the incremental ETL pipeline for multiple concurrent read requests. Experimental results show that HBelt can achieve high query throughput for real-time analytics.
Pages: 123-137
Number of pages: 15