Apache Wayang: A Unified Data Analytics Framework

Cited by: 2
Authors
Beedkar, Kaustubh [1, 4]
Contreras-Rojas, Bertty [2]
Gavriilidis, Haralampos [2]
Kaoudi, Zoi [3, 4]
Markl, Volker [2]
Pardo-Meza, Rodrigo [2]
Quiane-Ruiz, Jorge-Arnulfo [3, 4]
Affiliations
[1] Indian Inst Technol Delhi, New Delhi, India
[2] Tech Univ Berlin, Berlin, Germany
[3] IT Univ Copenhagen, Copenhagen, Denmark
[4] Databloom Inc, Miami, FL 33127 USA
DOI
10.1145/3631504.3631510
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The large variety of specialized data processing platforms and the increased complexity of data analytics have led to the need for unifying data analytics within a single framework. Such a framework should free users from the burden of (i) choosing the right platform(s) and (ii) writing glue code between the different parts of their pipelines. Apache Wayang (Incubating) is the only open-source framework that provides a systematic solution to unified data analytics by integrating multiple heterogeneous data processing platforms. It achieves this by decoupling applications from the underlying platforms and providing an optimizer, so that users do not have to specify the platforms on which their pipeline should run. Wayang provides a unified view and processing model, effectively integrating the hodgepodge of heterogeneous platforms into a single framework with increased usability, without sacrificing performance or total cost of ownership. In this paper, we present the architecture of Wayang, describe its main components, and give an outlook on future directions.
Pages: 30-35
Number of pages: 6