A distributed multi-storage I/O system for data intensive scientific computing

Cited: 1
Authors
Shen, XH [1]
Choudhary, A [1]
Affiliations
[1] Northwestern Univ, Dept Elect & Comp Engn, Ctr Parallel & Distributed Comp, Evanston, IL 60208 USA
Keywords
multi-storage I/O system; access pattern; data intensive computing;
DOI
10.1016/j.parco.2003.05.009
Chinese Library Classification
TP301 [Theory and Methods];
Subject Classification Code
081202 ;
Abstract
More and more parallel applications are running in distributed environments to take advantage of easily available and inexpensive commodity resources. For data intensive applications, employing multiple distributed storage resources has many advantages. In this paper, we present a Multi-Storage I/O System (MS-I/O) that can not only effectively manage various distributed storage resources in the system, but also provide novel high-performance storage access schemes. MS-I/O employs many state-of-the-art I/O optimizations, such as collective I/O and asynchronous I/O, as well as a number of new techniques such as data location, data replication, subfile, superfile and data access history. In addition, many MS-I/O optimization schemes can work simultaneously within a single data access session, greatly improving performance. Although I/O optimization techniques help improve performance, they also complicate the I/O system, and most optimization techniques have their limitations. Selecting appropriate optimization policies therefore requires expert knowledge, which end users with little knowledge of I/O techniques cannot be expected to have. The task of making I/O optimization decisions should thus be left to the I/O system itself; that is, it should be automatic from the user's point of view. We present a User Access Pattern data structure, associated with each dataset, that helps MS-I/O easily make accurate I/O optimization decisions. (C) 2003 Elsevier B.V. All rights reserved.
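The abstract's idea of a per-dataset access history that drives automatic optimization selection can be illustrated with a minimal sketch. Everything below is hypothetical: the field names, the `record`/`suggest` methods, and the decision rules are illustrative assumptions, not the paper's actual User Access Pattern structure or MS-I/O's real policy engine.

```python
from dataclasses import dataclass


@dataclass
class AccessPattern:
    """Toy per-dataset access history (fields are illustrative only)."""
    reads: int = 0        # completed read sessions
    writes: int = 0       # completed write sessions
    sequential: int = 0   # sessions with contiguous requests
    strided: int = 0      # sessions with strided/interleaved requests
    max_clients: int = 1  # most processes seen in one session

    def record(self, op: str, strided: bool, clients: int = 1) -> None:
        """Update the history after each data access session."""
        if op == "read":
            self.reads += 1
        else:
            self.writes += 1
        if strided:
            self.strided += 1
        else:
            self.sequential += 1
        self.max_clients = max(self.max_clients, clients)

    def suggest(self) -> str:
        """Pick an optimization from the history (crude stand-in rules)."""
        if self.max_clients > 1 and self.strided > self.sequential:
            # Many clients issuing interleaved requests: merge them.
            return "collective I/O"
        if self.reads > self.writes and self.sequential >= self.strided:
            # Mostly sequential reads: overlap I/O with computation.
            return "asynchronous prefetch"
        return "plain buffered I/O"


if __name__ == "__main__":
    hist = AccessPattern()
    hist.record("read", strided=True, clients=16)
    hist.record("read", strided=True, clients=16)
    print(hist.suggest())  # strided, multi-client history -> "collective I/O"
```

The point of the sketch is the shape of the mechanism: the system accumulates observations transparently, so the choice between techniques like collective and asynchronous I/O never has to be made by the end user.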
Pages: 1623-1643 (21 pages)
Related Papers
50 in total
  • [1] A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing
    X. Shen
    A. Choudhary
    C. Matarazzo
    P. Sinha
    Cluster Computing, 2003, 6 (3) : 189 - 200
  • [2] A distributed multi-storage resource architecture and I/O performance prediction for scientific computing
    Shen, XH
    Choudhary, A
    NINTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE DISTRIBUTED COMPUTING, PROCEEDINGS, 2000, : 21 - 30
  • [3] MS-I/O: A distributed multi-storage I/O system
    Shen, XH
    Choudhary, A
    CCGRID 2002: 2ND IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, PROCEEDINGS, 2002, : 163 - 172
  • [4] Scalable Distributed Storage for Multidimensional Scientific Data Computing
    Kokoulin, A. N.
    Dadenkov, S. A.
    PROCEEDINGS OF 2017 XX IEEE INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND MEASUREMENTS (SCM), 2017, : 596 - 599
  • [5] Distributed parallel file system for I/O intensive parallel computing on clusters
    Domínguez-Domínguez, S
    Buenabad-Chávez, J
    2004 1st International Conference on Electrical and Electronics Engineering (ICEEE), 2004, : 194 - 199
  • [6] A generic framework for distributed multi-generation and multi-storage energy systems
    Khalilpour, Kaveh Rajab
    Vassallo, Anthony
    ENERGY, 2016, 114 : 798 - 813
  • [7] Beyond the storage area network: Data intensive computing in a distributed environment
    Duffy, D
    Acks, N
    Noga, V
    Schardt, T
    Gary, JP
    Fink, B
    Kobler, B
    Donovan, M
    McElvaney, J
    Twenty-Second IEEE/Thirteenth NASA Goddard Conference on Mass Storage Systems and Technologies, Proceedings: INFORMATION RETRIEVAL FROM VERY LARGE STORAGE SYSTEMS, 2005, : 232 - 236
  • [8] Multi-storage hybrid system approach and experimental investigations
    Bocklisch, Thilo
    Boettiger, Michael
    Paulitschke, Martin
    8TH INTERNATIONAL RENEWABLE ENERGY STORAGE CONFERENCE AND EXHIBITION (IRES 2013), 2014, 46 : 186 - 193
  • [10] Distributed Data Access/Find System with Metadata for Data-Intensive Computing
    Ikebe, Minoru
    Inomata, Atsuo
    Fujikawa, Kazutoshi
    Sunahara, Hideki
    2008 9TH IEEE/ACM INTERNATIONAL CONFERENCE ON GRID COMPUTING, 2008, : 361 - 366