Scalable and Efficient Data Analytics and Mining with Lemonade

Cited by: 5
Authors
dos Santos, Walter [1]
Avelar, Gustavo P. [1]
Ribeiro, Manoel Horta [1]
Guedes, Dorgival [1]
Meira Jr, Wagner [1]
Affiliation
[1] Univ Fed Minas Gerais, Dept Ciencia Comp, Belo Horizonte, MG, Brazil
Source
Proceedings of the VLDB Endowment | 2018, Vol. 11, No. 12
DOI
10.14778/3229863.3236262
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Professionals outside of Computer Science increasingly need to analyze large bodies of data. This analysis often demands a high level of security and has to be done in the cloud. However, current data analysis tools that demand little proficiency in systems programming struggle to deliver solutions that are scalable and safe. In this context, we present Lemonade, a platform that focuses on creating data analysis and mining flows in the cloud, with authentication, authorization, and accounting (AAA) guarantees. Lemonade provides an interface for the visual construction of flows and encapsulates the details of the storage and data processing environment, providing higher-level abstractions for data source access and algorithms. We illustrate its usage through a demo in which a data processing flow builds a classification model for detecting fake news and extracts some insights along the way.
Pages: 2070-2073
Page count: 4