Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression

Cited by: 10
Authors
Bicer, Tekin [1]
Yin, Jian [2]
Agrawal, Gagan [1]
Affiliations
[1] Ohio State Univ, Comp Sci & Engn, Columbus, OH 43210 USA
[2] Pacific NW Natl Lab, Richland, WA 99352 USA
DOI
10.1109/CCGrid.2014.112
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
The increasing number of cores in parallel computer systems allows scientific simulations to be executed at increasing spatial and temporal granularity. However, this also means that increasingly large datasets need to be output, stored, managed, and then visualized and/or analyzed using a variety of methods. In examining the possibility of using compression to accelerate all of these steps, we focus on two important questions: "Can compression help save time when data is output from, or input into, a parallel program?" and "How can a scientist's effort in using compression with a parallel program be minimized?" We focus on PnetCDF and show how transparent compression can be supported, allowing an existing simulation program to output and store data in compressed form and, similarly, allowing a data analysis application to read compressed data. We address the challenges of supporting compression when parallel writes are performed. In our experiments, we first analyze the effects of compression with microbenchmarks, and then continue our evaluation with a scientific simulation application and two data analysis applications. While we obtain up to a factor of 2 improvement in performance for the microbenchmarks, the execution time of the simulation application is improved by up to 22%, and the maximum speedup of the data analysis applications is 1.83 (with an average speedup of 1.36).
Pages: 1 / 10
Page count: 10
Related papers
50 in total
  • [1] Improving I/O Forwarding Throughput with Data Compression
    Welton, Benjamin
    Kimpe, Dries
    Cope, Jason
    Patrick, Christina M.
    Iskra, Kamil
    Ross, Robert
    2011 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2011, : 438 - 445
  • [2] User-controllable parallel I/O for scientific applications
    Lee, JS
    Song, Y
    1998 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 1998, : 792 - 799
  • [3] Transparent Asynchronous Parallel I/O Using Background Threads
    Tang, Houjun
    Koziol, Quincey
    Ravi, John
    Byna, Suren
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (04) : 891 - 902
  • [4] A Novel Model for Synthesizing Parallel I/O Workloads in Scientific Applications
    Feng, Dan
    Zou, Qiang
    Jiang, Hong
    Zhu, Yifeng
    2008 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, 2008, : 252 - 261
  • [5] Automated tuning of parallel I/O systems: An approach to portable I/O performance for scientific applications
    Chen, Y
    Winslett, M
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2000, 26 (04) : 362 - 383
  • [6] Improving I/O Performance with Adaptive Data Compression for Big Data Applications
    Zou, Hongbo
    Yu, Yongen
    Tang, Wei
    Chen, Hsuanwei Michelle
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2014, : 1229 - 1238
  • [7] Using Transparent Compression to Improve SSD-based I/O Caches
    Makatos, Thanos
    Klonatos, Yannis
    Marazakis, Manolis
    Flouris, Michail D.
    Bilas, Angelos
    EUROSYS'10: PROCEEDINGS OF THE EUROSYS 2010 CONFERENCE, 2010, : 1 - 14
  • [8] Improving the performance of scientific parallel applications in a cluster of workstations
    Flores, A
    García, JM
    APPLIED PARALLEL COMPUTING: LARGE SCALE SCIENTIFIC AND INDUSTRIAL PROBLEMS, 1998, 1541 : 134 - 141
  • [9] COMPASSION: A parallel I/O runtime system including chunking and compression for irregular applications
    Carretero, J
    No, J
    Park, SS
    Choudhary, A
    Chen, P
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1998, 1401 : 668 - 677
  • [10] Parallel I/O for scientific applications on heterogeneous clusters: A resource-utilization approach
    Cho, Yong E.
    Winslett, Marianne
    Kuo, Szu-wen
    Lee, Jonghyun
    Chen, Ying
    Proceedings of the International Conference on Supercomputing, 1999, : 253 - 259