Logical big data integration and near real-time data analytics

Cited by: 6
Authors
Silva, Bruno [1]
Moreira, Jose [1, 2]
Costa, Rogerio Luis de C. [3]
Affiliations
[1] Univ Aveiro, Inst Elect & Informat Engn IEETA, LASI, P-3810193 Aveiro, Portugal
[2] Univ Aveiro, Dept Elect Telecommun & Informat DETI, P-3810193 Aveiro, Portugal
[3] Polytech Leiria, Comp Sci & Commun Res Ctr CIIC, P-2411901 Leiria, Portugal
Keywords
Big data integration; Distributed databases; Near real-time OLAP
DOI
10.1016/j.datak.2023.102185
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the context of decision-making, there is a growing demand for near real-time data that traditional solutions, like data warehousing based on long-running ETL processes, cannot fully meet. On the other hand, existing logical data integration solutions are challenging to use because users must focus on data location and distribution details rather than on data analytics and decision-making. EasyBDI is an open-source system that provides logical integration of data and high-level business-oriented abstractions. It uses schema matching, integration, and mapping techniques to automatically identify partitioned data and propose a global schema. Users can then specify star schemas based on global entities and submit analytical queries to retrieve data from distributed data sources without knowing the organization and other technical details of the underlying systems. This work presents the algorithms and methods for global schema creation and query execution. Experimental results show that the overhead imposed by logical integration layers is relatively small compared to the execution times of distributed queries.
Pages: 20
Related papers
50 in total
  • [21] Near real-time streaming analysis of big fusion data
    Kube, R.
    Churchill, R. M.
    Chang, C. S.
    Choi, J.
    Wang, R.
    Klasky, S.
    Stephey, L.
    Dart, E.
    Choi, M. J.
    PLASMA PHYSICS AND CONTROLLED FUSION, 2022, 64 (03)
  • [22] GPGPU for Real-Time Data Analytics
    He, Bingsheng
    Huynh Phung Huynh
    Mong, Rick Goh Siow
    PROCEEDINGS OF THE 2012 IEEE 18TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS 2012), 2012: 945 - +
  • [23] Data Systems Fault Coping for Real-time Big Data Analytics Required Architectural Crucibles
    Cohen, Stephen
    Money, William
    PROCEEDINGS OF THE 50TH ANNUAL HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, 2017: 1023 - 1032
  • [24] Big Data Real-Time Clickstream Data Ingestion Paradigm for E-Commerce Analytics
    Pal, Gautam
    Li, Gangmin
    Atkinson, Katie
    2018 4TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2018
  • [25] Mapping the Big Data Landscape: Technologies, Platforms and Paradigms for Real-Time Analytics of Data Streams
    Dubuc, Timothee
    Stahl, Frederic
    Roesch, Etienne B.
    IEEE ACCESS, 2021, 9 : 15351 - 15374
  • [26] Real-Time Data ETL Framework for Big Real-Time Data Analysis
    Li, Xiaofang
    Mao, Yingchi
    2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015: 1289 - 1294
  • [27] Using a Rich Context Model for Real-Time Big Data Analytics in Twitter
    Sotsenko, Alisa
    Jansen, Marc
    Milrad, Marcelo
    Rana, Juwel
    2016 IEEE 4TH INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD WORKSHOPS (FICLOUDW), 2016: 228 - 233
  • [28] Real-time QoS Monitoring for Big Data Analytics in Mobile Environment: an Overview
    Xiao, Fang
    Wainaina, Paul
    2016 INTERNATIONAL CONGRESS ON COMPUTATION ALGORITHMS IN ENGINEERING (ICCAE 2016), 2016: 26 - 30
  • [29] Using Big Data and Real-Time Analytics to Support Smart City Initiatives
    Souza, Arthur
    Figueredo, Mickael
    Cacho, Nelio
    Araujo, Daniel
    Prolo, Carlos A.
    IFAC PAPERSONLINE, 2016, 49 (30): 257 - 262
  • [30] Real-time big data analytics for hard disk drive predictive maintenance
    Su, Chuan-Jun
    Huang, Shi-Feng
    COMPUTERS & ELECTRICAL ENGINEERING, 2018, 71 : 93 - 101