P-found: Grid-enabling distributed repositories of protein folding and unfolding simulations for data mining

Cited by: 4
Authors
Swain, Martin [3]
Silva, Candida G. [1, 2]
Loureiro-Ferreira, Nuno [1, 2]
Ostropytskyy, Vitaliy [3]
Brito, Joao [4]
Riche, Olivier [3]
Stahl, Frederick [3]
Dubitzky, Werner [3]
Brito, Rui M. M. [1, 2]
Affiliations
[1] Univ Coimbra, Dept Chem, Fac Sci & Technol, P-3004535 Coimbra, Portugal
[2] Univ Coimbra, Ctr Neurosci & Cell Biol, P-3004535 Coimbra, Portugal
[3] Univ Ulster, Coleraine BT52 1SA, Londonderry, Northern Ireland
[4] Crit Software SA, P-3045504 Coimbra, Portugal
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2010, Vol. 26, No. 3
Keywords
Data mining; Distributed systems; Service-oriented architecture; Grid; MOLECULAR-DYNAMICS; DESIGN; DYNAMEOMICS;
DOI
10.1016/j.future.2009.08.008
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories - this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery. (C) 2009 Elsevier B.V. All rights reserved.
Pages: 424-433
Page count: 10