Continuous Cloud-Scale Query Optimization and Processing

Cited by: 27
Authors
Bruno, Nicolas [1]
Jain, Sapna [2]
Zhou, Jingren [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98008 USA
[2] Indian Inst Technol, Bombay, Maharashtra, India
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 11
DOI
10.14778/2536222.2536223
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Massive data analysis in cloud-scale data centers plays a crucial role in making critical business decisions. High-level scripting languages free developers from understanding various system trade-offs, but introduce new challenges for query optimization. One key optimization challenge is the lack of accurate data statistics, typically due to massive data volumes and their distributed nature, complex computation logic, and frequent use of user-defined functions. In this paper we propose novel techniques to adapt query processing in the Scope system, the cloud-scale computation environment in Microsoft Online Services. We continuously monitor query execution, collect actual runtime statistics, and adapt parallel execution plans as the query executes. We discuss similarities and differences between our approach and alternatives proposed in the context of traditional centralized systems. Experiments on large-scale Scope production clusters show that the proposed techniques systematically solve the challenge of missing or inaccurate data statistics, detect and resolve partition skew and plan structure, and improve query latency severalfold for real workloads. Although we focus on optimizing high-level languages, the same ideas are also applicable to MapReduce systems.
Pages: 961-972
Page count: 12