Performance Analysis of Emerging Data Analytics and HPC Workloads

Cited by: 0
Authors
Daley, Christopher S. [1]
Dosanjh, Sudip [1]
Prabhat [1]
Wright, Nicholas J. [1]
Affiliations
[1] Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
Keywords
Workload characteristics; data analytics; big data; high performance computing; SEXTRACTOR
DOI
10.1145/3149393.3149400
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Supercomputers are increasingly being used to run a data analytics workload in addition to a traditional simulation science workload. This mixed workload must be rigorously characterized to ensure that appropriately balanced machines are deployed. In this paper we analyze a suite of applications representing the simulation science and data workload at the NERSC supercomputing center. We show how time is spent in application compute, library compute, communication and I/O, and present application performance on both the Intel Xeon and Intel Xeon-Phi partitions of the Cori supercomputer. We find commonality in the libraries used, I/O motifs and methods of parallelism, and obtain similar node-to-node performance for the base application configurations. We demonstrate that features of the Intel Xeon-Phi node architecture and a Burst Buffer can improve application performance, providing evidence that an exascale-era energy-efficient platform can support a mixed workload.
Pages: 43-48
Page count: 6
Related papers
50 in total
  • [41] Differentiated Performance in NoSQL Database Access for Hybrid Cloud-HPC Workloads
    Andreoli, Remo
    Cucinotta, Tommaso
    HIGH PERFORMANCE COMPUTING - ISC HIGH PERFORMANCE DIGITAL 2021 INTERNATIONAL WORKSHOPS, 2021, 12761 : 439 - 449
  • [42] Does Varying BeeGFS Configuration Affect the I/O Performance of HPC Workloads?
    Borkar, Arnav
    Tony, Joel
    Vamsi, Hari K. N.
    Barman, Tushar
    Bhisikar, Yash
    Sreenath, T. M.
    Paul, Arnab K.
    2023 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING WORKSHOPS, CLUSTER WORKSHOPS, 2023, : 5 - 7
  • [43] Proxy Benchmarks for Emerging Big-data Workloads
    Panda, Reena
    John, Lizy Kurian
    2017 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS), 2017, : 139 - 140
  • [44] SYMBIOSYS: A Methodology for Performance Analysis of Composable HPC Data Services
    Ramesh, Srinivasan
    Malony, Allen D.
    Carns, Philip
    Ross, Robert B.
    Dorier, Matthieu
    Soumagne, Jerome
    Snyder, Shane
    2021 IEEE 35TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2021, : 35 - 45
  • [45] Performance Analysis of HPC Applications with Irregular Tree Data Structures
    Khawaja, Ahmed
    Wang, Jiajun
    Gerstlauer, Andreas
    John, Lizy K.
    Malhotra, Dhairya
    Biros, George
    2014 20TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2014, : 418 - 425
  • [46] Data and Visual Analytics for Emerging Databases
    Leung, Carson K.
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON EMERGING DATABASES: TECHNOLOGIES, APPLICATIONS, AND THEORY, 2018, 461 : 203 - 213
  • [47] Pharma Data Analytics : An Emerging Trend
    Kaddi, Shweta S.
    Patil, Malini M.
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON SMART TECHNOLOGIES FOR SMART NATION (SMARTTECHCON), 2017, : 1307 - 1311
  • [48] Towards Sustainability and Energy Efficiency Using Data Analytics for HPC Data Center
    Chinnici, Andrea
    Ahmadzada, Eyvaz
    Kor, Ah-Lian
    De Chiara, Davide
    Dominguez-Diaz, Adrian
    de Marcos Ortega, Luis
    Chinnici, Marta
    ELECTRONICS, 2024, 13 (17)
  • [49] Facilitating the HPC Data Center Host efficiency through Big Data Analytics
    Rager, Jack
    Liu, Fang Cherry
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 3280 - 3287
  • [50] Characterizing Scheduling Delay for Low-latency Data Analytics Workloads
    Chen, Wei
    Pi, Aidi
    Wang, Shaoqi
    Zhou, Xiaobo
    2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018, : 630 - 639