Compliant Geo-distributed Data Processing in Action

Cited by: 3
Authors
Beedkar, Kaustubh [1]
Brekardin, David [1]
Quiane-Ruiz, Jorge-Arnulfo [1,2]
Markl, Volker [1,2]
Affiliations
[1] TU Berlin, Berlin, Germany
[2] DFKI, Kaiserslautern, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2021 / Vol. 14 / No. 12
DOI
10.14778/3476311.3476359
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this paper, we present our work on compliant geo-distributed data processing. Our work focuses on the new dimension of dataflow constraints that regulate the movement of data across geographical or institutional borders. For example, European directives may allow transferring only certain information fields (such as non-personal information) or aggregated data. Thus, it is crucial for distributed data processing frameworks to consider compliance with respect to dataflow constraints derived from these regulations. We have developed a compliance-based data processing framework, which (i) allows for the declarative specification of dataflow constraints, (ii) determines if a query can be translated into a compliant distributed query execution plan, and (iii) executes the compliant plan over distributed SQL databases. We demonstrate our framework using a geo-distributed adaptation of the TPC-H benchmark data. Our framework provides an interactive dashboard, which allows users to specify dataflow constraints and to analyze and execute compliant distributed query execution plans.
Pages: 2843 - 2846
Number of pages: 4
Related papers
50 in total
  • [31] Fault-tolerant scheduling and data placement for scientific workflow processing in geo-distributed clouds
    Li, Chunlin
    Liu, Jun
    Wang, Min
    Luo, Youlong
    JOURNAL OF SYSTEMS AND SOFTWARE, 2022, 187
  • [32] Data Centers Selection for Moving Geo-distributed Big Data to Cloud
    Zhang, Jiangtao
    Yuan, Qiang
    Chen, Shi
    Huang, Hejiao
    Wang, Xuan
    JOURNAL OF INTERNET TECHNOLOGY, 2019, 20 (01) : 111 - 122
  • [33] Temperature Aware Workload Management in Geo-Distributed Data Centers
    Xu, Hong
    Feng, Chen
    Li, Baochun
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (06) : 1743 - 1753
  • [34] Optimal Query Plans for Geo-distributed Data Analytics at Scale
    Pradhan, Ahana
    Karthik, Srinivas
    Subramanya, Raghunandan
    PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024, : 247 - 251
  • [35] MapReduce Task Scheduling in Heterogeneous Geo-Distributed Data Centers
    Li, Xiaoping
    Chen, Fuchao
    Ruiz, Ruben
    Zhu, Jie
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (06) : 3317 - 3329
  • [36] Joint Scheduling of Data and Computation in Geo-distributed Cloud Systems
    Yin, Lingyan
    Sun, Jizhou
    Zhao, Laiping
    Cui, Chenzhou
    Xiao, Jian
    Yu, Ce
    2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, : 657 - 666
  • [37] Samya: A Geo-Distributed Data System for High Contention Aggregate Data
    Maiyya, Sujaya
    Ahmad, Ishtiyaque
    Agrawal, Divyakant
    El Abbadi, Amr
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 1440 - 1451
  • [38] Optimizing Windowed Aggregation over Geo-Distributed Data Streams
    Sajjad, Hooman Peiro
    Vlassov, Vladimir
    Liu, Ying
    2018 IEEE INTERNATIONAL CONFERENCE ON EDGE COMPUTING (IEEE EDGE), 2018, : 33 - 41
  • [39] A MapReduce Cluster Deployment Optimization Framework with Geo-distributed Data
    Li, Shanshan
    Lu, Qinghua
    Zhang, Weishan
    Zhu, Liming
    IEEE 12TH INT CONF UBIQUITOUS INTELLIGENCE & COMP/IEEE 12TH INT CONF ADV & TRUSTED COMP/IEEE 15TH INT CONF SCALABLE COMP & COMMUN/IEEE INT CONF CLOUD & BIG DATA COMP/IEEE INT CONF INTERNET PEOPLE AND ASSOCIATED SYMPOSIA/WORKSHOPS, 2015, : 943 - 949
  • [40] Fast, scalable and geo-distributed PCA for big data analytics
    Adnan, T. M. Tariq
    Tanjim, Md Mehrab
    Adnan, Muhammad Abdullah
    INFORMATION SYSTEMS, 2021, 98 (98)