Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression

Cited by: 10
Authors
Bicer, Tekin [1 ]
Yin, Jian [2 ]
Agrawal, Gagan [1 ]
Affiliations
[1] Ohio State Univ, Comp Sci & Engn, Columbus, OH 43210 USA
[2] Pacific NW Natl Lab, Richland, WA 99352 USA
DOI
10.1109/CCGrid.2014.112
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
The increasing number of cores in parallel computer systems is allowing scientific simulations to be executed with increasing spatial and temporal granularity. However, this also implies that increasingly large datasets need to be output, stored, managed, and then visualized and/or analyzed using a variety of methods. In examining the possibility of using compression to accelerate all of these steps, we focus on two important questions: "Can compression help save time when data is output from, or input into, a parallel program?" and "How can a scientist's effort in using compression with a parallel program be minimized?" We focus on PnetCDF and show how transparent compression can be supported, thus allowing an existing simulation program to start outputting and storing data in a compressed fashion and, similarly, allowing a data analysis application to read compressed data. We address challenges in supporting compression when parallel writes are being performed. In our experiments, we first analyze the effects of using compression with microbenchmarks, and then continue our evaluation using a scientific simulation application and two data analysis applications. While we obtain up to a factor of 2 improvement in performance for microbenchmarks, the execution time of the simulation application is improved by up to 22%, and the maximum speedup of the data analysis applications is 1.83 (with an average speedup of 1.36).
Pages: 1 / 10
Page count: 10
Related papers
50 in total
  • [21] Collective buffering: Improving parallel I/O performance
    Nitzberg, B
    Lo, V
    SIXTH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING, PROCEEDINGS, 1997, : 148 - 157
  • [22] Accelerating memory and I/O intensive HPC applications using hardware compression
    AlSaleh, Saleh
    Elrabaa, Muhammad E. S.
    El-Maleh, Aiman
    Daud, Khaled
    Hroub, Ayman
    Mudawar, Muhamed
    Tonellot, Thierry
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 193
  • [23] Improving Parallel I/O Performance Using Multithreaded Two-Phase I/O with Processor Affinity Management
    Tsujita, Yuichi
    Yoshinaga, Kazumi
    Hori, Atsushi
    Sato, Mikiko
    Namiki, Mitaro
    Ishikawa, Yutaka
    PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2013), PT I, 2014, 8384 : 714 - 723
  • [24] Automatic Cloud I/O Configurator for I/O Intensive Parallel Applications
    Zhai, Jidong
    Liu, Mingliang
    Jin, Ye
    Ma, Xiaosong
    Chen, Wenguang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (12) : 3275 - 3288
  • [25] I/O requirements of scientific applications: An evolutionary view
    Smirni, E
    Aydt, RA
    Chien, AA
    Reed, DA
    PROCEEDINGS OF THE FIFTH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING, 1996, : 49 - 59
  • [26] High-Throughput Parallel-I/O using SIONlib for Mesoscopic Particle Dynamics Simulations on Massively Parallel Computers
    Freche, Jens
    Frings, Wolfgang
    Sutmann, Godehard
    PARALLEL COMPUTING: FROM MULTICORES AND GPU'S TO PETASCALE, 2010, 19 : 371 - 378
  • [27] I/O optimization in the checkpointing of OpenMP parallel applications
    Losada, Nuria
    Martin, Maria J.
    Rodriguez, Gabriel
    Gonzalez, Patricia
    23RD EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2015), 2015, : 222 - 229
  • [28] Analyzing the Parallel I/O Severity of MPI Applications
    Mendez, Sandra
    Rexachs, Dolores
    Luque, Emilio
    2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2017, : 953 - 962
  • [29] A study of real world I/O performance in parallel scientific computing
    Kimpe, Dries
    Lani, Andrea
    Quintino, Tiago
    Vandewalle, Stefan
    Poedts, Stefaan
    Deconinck, Herman
    APPLIED PARALLEL COMPUTING: STATE OF THE ART IN SCIENTIFIC COMPUTING, 2007, 4699 : 871 - +
  • [30] Using clustering to address heterogeneity and dynamism in parallel scientific applications
    Li, XL
    Parashar, M
    HIGH PERFORMANCE COMPUTING - HIPC 2005, PROCEEDINGS, 2005, 3769 : 247 - 257