Project Daytona: Data Analytics as a Cloud Service

Cited by: 14
Authors
Barga, Roger S. [1]
Ekanayake, Jaliya [1]
Lu, Wei [1]
Affiliations
[1] Microsoft Corp, Microsoft Res, eXtreme Comp Grp, Redmond, WA 98053 USA
DOI
10.1109/ICDE.2012.136
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Spreadsheets are established data collection and analysis tools in business, technical computing, and academic research. Excel, for example, offers an attractive user interface, provides an easy-to-use data entry model, and offers substantial interactivity for what-if analysis. However, spreadsheets and other common client applications do not offer scalable computation for large-scale data analytics and exploration. Increasingly, researchers in domains ranging from the social sciences to the environmental sciences face a deluge of data, often sitting in spreadsheets such as Excel or in other client applications, and they lack a convenient way to explore the data, find related data sets, or invoke scalable analytical models over the data. To address these limitations, we have developed a cloud data analytics service based on Daytona, an iterative MapReduce runtime optimized for data analytics. In our model, Excel and other existing client applications provide the data entry and user interaction surfaces, Daytona provides a scalable runtime on the cloud for data analytics, and our service bridges the gap between the client and the cloud. Any analyst can use our data analytics service to discover and import data from the cloud, invoke cloud-scale data analytics algorithms to extract information from large datasets, invoke data visualization, and then store the data back to the cloud, all through a spreadsheet or other client application they are already familiar with.
Pages: 1317-1320
Page count: 4