Efficient incremental loading in ETL processing for real-time data integration

Authors
Neepa Biswas
Anamitra Sarkar
Kartick Chandra Mondal
Affiliation
[1] Jadavpur University, Department of Information Technology
Keywords
Data warehouse; Code-based ETL; ETL tools; Pygrametl; Petl; Scriptella; Incremental load; Bulk load; CDC
Abstract
ETL (extract, transform, load) is the standard, widely used process for creating and maintaining a data warehouse (DW). It is also the most resource-, cost- and time-intensive part of DW implementation and maintenance. Many graphical user interface (GUI)-based solutions are now available to facilitate ETL processes. Despite the popularity of GUI-based tools, this approach still has drawbacks. This paper focuses on the alternative approach of developing ETL by hand coding. In some contexts, such as research and academic work, a custom-coded solution is appropriate and can be cheaper, faster and more maintainable than GUI-based tools. This article studies several well-known code-based open-source ETL tools developed in academia, covering their architecture and implementation details. The aim of the paper is to present a comparative evaluation of these code-based ETL tools. Finally, an efficient ETL model is designed to meet present-day near real-time requirements.
Pages: 53-61
Page count: 8