Microlearner: A fine-grained Learning Optimizer for Big Data Workloads at Microsoft

Cited by: 10
Authors
Jindal, Alekh [1]
Qiao, Shi [2]
Sen, Rathijit [1]
Patel, Hiren [2]
Affiliations
[1] Microsoft Corp, Gray Syst Lab, Redmond, WA 98052 USA
[2] Microsoft Corp, Azure Data, Redmond, WA 98052 USA
DOI
10.1109/ICDE51399.2021.00275
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Big data systems have become increasingly complex, making the job of a query optimizer incredibly difficult. This is due to more complicated decision making, more complex query plans, and more tedious objective functions in cloud-based big data workloads. As a result, production cloud query optimizers are often far from optimal. In this paper, we describe building a learning query optimizer for big data workloads at Microsoft. We make four major contributions. First, we describe the challenges in cloud query optimizers based on our observations from the big data workloads at Microsoft. Second, we discuss what makes machine learning an attractive approach to aid big data query optimizers in decision making. Third, we present Microlearner, a practical approach to characterize large cloud workloads into smaller subsets and build micromodels over each subset to tame the complexity of big data workloads. Finally, we describe the productization of Microlearner, using learned cardinality as a concrete example, via performance results over very large production workloads and by illustrating the various challenges involved in deployment.
Pages: 2423-2434
Page count: 12