Apache Wayang: A Unified Data Analytics Framework

Cited by: 2
Authors
Beedkar, Kaustubh [1, 4]
Contreras-Rojas, Bertty [2]
Gavriilidis, Haralampos [2]
Kaoudi, Zoi [3, 4]
Markl, Volker [2]
Pardo-Meza, Rodrigo [2]
Quiane-Ruiz, Jorge-Arnulfo [3, 4]
Affiliations
[1] Indian Inst Technol Delhi, New Delhi, India
[2] Tech Univ Berlin, Berlin, Germany
[3] IT Univ Copenhagen, Copenhagen, Denmark
[4] Databloom Inc, Miami, FL 33127 USA
DOI
10.1145/3631504.3631510
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The large variety of specialized data processing platforms and the increased complexity of data analytics have led to the need for unifying data analytics within a single framework. Such a framework should free users from the burden of (i) choosing the right platform(s) and (ii) gluing code between the different parts of their pipelines. Apache Wayang (Incubating) is the only open-source framework that provides a systematic solution to unified data analytics by integrating multiple heterogeneous data processing platforms. It achieves this by decoupling applications from the underlying platforms and providing an optimizer so that users do not have to specify the platforms on which their pipeline should run. Wayang provides a unified view and processing model, effectively integrating the hodgepodge of heterogeneous platforms into a single framework with increased usability without sacrificing performance and total cost of ownership. In this paper, we present the architecture of Wayang, describe its main components, and give an outlook on future directions.
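
The idea of decoupling a pipeline from its execution platform and letting an optimizer pick the platform can be illustrated with a toy sketch. This is not Wayang's API or its optimizer: the `Operator`, `Platform`, `choose_platform`, and `execute` names, the two platforms, and the linear cost model are all hypothetical and chosen only to show the principle that the user writes a platform-agnostic plan and a cost estimate decides where each step runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class Operator:
    """A platform-agnostic logical operator (hypothetical, not Wayang's API)."""
    kind: str                 # "map" or "filter"
    fn: Callable[[Any], Any]


@dataclass(frozen=True)
class Platform:
    """A toy backend with an assumed linear cost model."""
    name: str
    startup_cost: float       # fixed overhead per operator
    per_item_cost: float      # cost per input item

    def cost(self, n_items: int) -> float:
        return self.startup_cost + self.per_item_cost * n_items

    def run(self, op: Operator, data: List[Any]) -> List[Any]:
        # Both toy platforms execute in-process; only their costs differ.
        if op.kind == "map":
            return [op.fn(x) for x in data]
        if op.kind == "filter":
            return [x for x in data if op.fn(x)]
        raise ValueError(f"unsupported operator kind: {op.kind}")


# Illustrative numbers only: cheap to start vs. cheap per item.
LOCAL = Platform("local", startup_cost=1.0, per_item_cost=0.01)
CLUSTER = Platform("cluster", startup_cost=50.0, per_item_cost=0.001)


def choose_platform(platforms: List[Platform], n_items: int) -> Platform:
    """Pick the platform with the lowest estimated cost for this input size."""
    return min(platforms, key=lambda p: p.cost(n_items))


def execute(plan: List[Operator], data: List[Any],
            platforms: List[Platform]) -> Tuple[List[Any], List[str]]:
    """Run the plan, choosing a platform per operator from the cost estimate."""
    chosen = []
    for op in plan:
        platform = choose_platform(platforms, len(data))
        chosen.append(platform.name)
        data = platform.run(op, data)
    return data, chosen


# The plan itself never names a platform.
plan = [Operator("map", lambda x: x * 2),
        Operator("filter", lambda x: x % 3 == 0)]
small_result, small_platforms = execute(plan, list(range(10)), [LOCAL, CLUSTER])
large_result, large_platforms = execute(plan, list(range(100_000)), [LOCAL, CLUSTER])
```

Under these assumed costs the small input stays on the low-overhead platform and the large input moves to the one with the lower per-item cost, while the plan definition is unchanged in both runs.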
Pages: 30-35
Number of pages: 6