Microlearner: A fine-grained Learning Optimizer for Big Data Workloads at Microsoft

Cited by: 10
Authors
Jindal, Alekh [1]
Qiao, Shi [2]
Sen, Rathijit [1]
Patel, Hiren [2]
Affiliations
[1] Microsoft Corp, Gray Syst Lab, Redmond, WA 98052 USA
[2] Microsoft Corp, Azure Data, Redmond, WA 98052 USA
DOI
10.1109/ICDE51399.2021.00275
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Big data systems have become increasingly complex, making the job of a query optimizer incredibly difficult. This is due to more complicated decision making, more complex query plans, and more tedious objective functions in cloud-based big data workloads. As a result, production cloud query optimizers are often far from optimal. In this paper, we describe building a learning query optimizer for big data workloads at Microsoft. We make four major contributions. First, we describe the challenges in cloud query optimizers based on our observations from the big data workloads at Microsoft. Second, we discuss what makes machine learning an attractive approach to aid big data query optimizers in decision making. Third, we present Microlearner, a practical approach to characterize large cloud workloads into smaller subsets and build micromodels over each subset to tame the complexity of big data workloads. Finally, we describe the productization of Microlearner, using learned cardinality as a concrete example, via performance results over very large production workloads and by illustrating the various challenges involved in deployment.
Pages: 2423-2434
Number of pages: 12
Related papers
  • [1] Fine-Grained Accelerators for Sparse Machine Learning Workloads
    Mishra, Asit K.
    Nurvitadhi, Eriko
    Venkatesh, Ganesh
    Pearce, Jonathan
    Marr, Debbie
    2017 22ND ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2017, : 635 - 640
  • [2] WattWatcher: Fine-Grained Power Estimation For Emerging Workloads
    LeBeane, Michael
    Ryoo, Jee Ho
    Panda, Reena
    John, Lizy K.
    2015 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 2015, : 106 - 113
  • [3] Forecasting Fine-Grained Air Quality Based on Big Data
    Zheng, Yu
    Yi, Xiuwen
    Li, Ming
    Li, Ruiyuan
    Shan, Zhangqing
    Chang, Eric
    Li, Tianrui
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 2267 - 2276
  • [4] Towards Fine-Grained Dataflow Parallelism in Big Data Systems
    Ertel, Sebastian
    Adam, Justus
    Castrillon, Jeronimo
    LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2017, 2019, 11403 : 281 - 282
  • [5] Improve Fine-Grained Feature Learning in Fine-Grained DataSet GAI
    Wang, Hai Peng
    Geng, Zhi Qing
    IEEE ACCESS, 2025, 13 : 12777 - 12788
  • [6] A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning
    Lyu, Chenghao
    Fan, Qi
    Guyard, Philippe
    Diao, Yanlei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (11): : 3565 - 3579
  • [7] Fine-Grained Dynamic Resource Allocation for Big-Data Applications
    Baresi, Luciano
    Leva, Alberto
    Quattrocchi, Giovanni
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2021, 47 (08) : 1668 - 1682
  • [8] Construct Fine-grained Energy Big Data Using NILM Technology
    Liu, Yan
    Yuan, Ruiming
    Yang, Xiaokun
    Liu, Bo
    Zhang, Ruiqi
    2021 3RD ASIA ENERGY AND ELECTRICAL ENGINEERING SYMPOSIUM (AEEES 2021), 2021, : 1160 - 1164
  • [9] A Fine-Grained Distribution Approach for ETL Processes in Big Data Environments
    Bala, Mahfoud
    Boussaid, Omar
    Alimazighi, Zaia
    DATA & KNOWLEDGE ENGINEERING, 2017, 111 : 114 - 136
  • [10] tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads
    Chien, Steven W. D.
    Podobas, Artur
    Peng, Ivy B.
    Markidis, Stefano
    2020 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER 2020), 2020, : 359 - 370