A Holistic Approach to Data Access for Cloud-Native Analytics and Machine Learning

Cited by: 3
Authors
Koutsovasilis, Panos [1]
Venugopal, Srikumar [1]
Gkoufas, Yiannis [1]
Pinto, Christian [1]
Affiliations
[1] IBM Res Europe Dublin, Dublin, Ireland
DOI
10.1109/CLOUD53861.2021.00084
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud providers offer a variety of storage solutions for hosting data that differ in both price and performance. For analytics and machine learning applications, object storage services are the go-to solution for hosting datasets that exceed tens of gigabytes in size. However, this choice degrades the performance of these applications and requires extra engineering effort, in the form of code changes, to access the data on remote storage. In this paper, we present a generic end-to-end solution that offers seamless data access for remote object storage services, transparent data caching within the compute infrastructure, and data-aware topologies that boost the performance of applications deployed in Kubernetes. We introduce a custom-implemented cache mechanism that supports all of these requirements, and we demonstrate that our holistic solution leads to an improvement of up to 48% for the Spark implementation of the TPC-DS benchmark and up to 191% for the training of deep learning models from the MLPerf benchmark suite.
Pages: 654 - 659
Page count: 6
Related papers
50 in total
  • [1] Cloud-Native Transactions and Analytics in SingleStore
    Prout, Adam
    Wang, Szu-Po
    Victor, Joseph
    Sun, Zhou
    Li, Yongzhu
    Chen, Jack
    Bergeron, Evan
    Hanson, Eric
    Walzer, Robert
    Gomes, Rodrigo
    Shamgunov, Nikita
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 2340 - 2352
  • [2] Proactive Autoscaling for Cloud-Native Applications using Machine Learning
    Marie-Magdelaine, Nicolas
    Ahmed, Toufik
    2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2020,
  • [3] Holistic approach to machine tool data analytics
    Lenz, Juergen
    Wuest, Thorsten
    Westkaemper, Engelbert
    JOURNAL OF MANUFACTURING SYSTEMS, 2018, 48 : 180 - 191
  • [4] Machine Learning based Interference Modelling in Cloud-Native Applications
    Baluta, Alexandru
    Mukherjee, Joydeep
    Litoiu, Marin
    PROCEEDINGS OF THE 2022 ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING (ICPE '22), 2022, : 125 - 132
  • [5] Assessment of Performance for Cloud-Native Machine Learning on Edge Devices
    Clapa, Konrad
    Grudzien, Krzysztof
    Sierszen, Artur
    DIGITAL INTERACTION AND MACHINE INTELLIGENCE, MIDI 2023, 2024, 1076 : 95 - 105
  • [6] Holistic Root Cause Analysis for Failures in Cloud-Native Systems Through Observability Data
    Han, Yongqi
    Du, Qingfeng
    Huang, Ying
    Li, Pengsheng
    Shi, Xiaonan
    Wu, Jiaqi
    Fang, Pei
    Tian, Fulong
    He, Cheng
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2024, 17 (06) : 3789 - 3802
  • [7] Cloud-Native Repositories for Big Scientific Data
    Abernathey, Ryan P.
    Blackmon-Luca, Charles C.
    Crone, Timothy J.
    Henderson, Naomi
    Lepore, Chiara
    Augspurger, Tom
    Banihirwe, Anderson
    Gentemann, Chelle L.
    Hamman, Joseph J.
    McCaie, Theo A.
    Robinson, Niall H.
    Signell, Richard P.
    COMPUTING IN SCIENCE & ENGINEERING, 2021, 23 (02) : 26 - 35
  • [8] Machine learning with big data analytics for cloud security
    Mohammad, Abdul Salam
    Pradhan, Manas Ranjan
    COMPUTERS & ELECTRICAL ENGINEERING, 2021, 96
  • [9] Cloud-Native Repositories for Big Scientific Data
    Abernathey, Ryan P.
    Augspurger, Tom
    Banihirwe, Anderson
    Blackmon-Luca, Charles C.
    Crone, Timothy J.
    Gentemann, Chelle L.
    Hamman, Joseph J.
    Henderson, Naomi
    Lepore, Chiara
    McCaie, Theo A.
    Robinson, Niall H.
    Signell, Richard P.
    Computing in Science and Engineering, 2021, 23 (02) : 26 - 35
  • [10] Toward Cloud-Native, Machine Learning Base Detection of Crop Disease With Imaging Spectroscopy
    Rubambiza, Gloire
    Galvan, Fernando Romero
    Pavlick, Ryan
    Weatherspoon, Hakim
    Gold, Kaitlin M.
    JOURNAL OF GEOPHYSICAL RESEARCH-BIOGEOSCIENCES, 2023, 128 (06)