Challenges of Large-scale Biomedical Workflows on the Cloud - A Case Study on the Need for Reproducibility of Results

Cited by: 5
Authors
Kanwal, Sehrish [1 ]
Lonie, Andrew [1 ]
Sinnott, Richard O. [1 ]
Anderson, Charlotte [1 ]
Affiliations
[1] Univ Melbourne, Dept Comp & Informat Syst, Melbourne, Vic 3010, Australia
Keywords
bioinformatics workflows; distributed compute resources; exome; NeCTAR Research Cloud; reproducibility; SQUAMOUS-CELL CARCINOMA; QUALITY; HEAD; TOOL;
DOI
10.1109/CBMS.2015.28
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Computational bioinformatics workflows are extensively used to analyse genomics data. With the unprecedented advancements in genomic sequencing technology and the opportunities for personalized medicine, it is essential that analysis results are repeatable by others, especially when moving into a clinical environment. To cope with the complex computational demands of huge biological datasets, a shift to distributed compute resources is unavoidable. A case study was conducted in which three well-established bioinformatics analysis groups across Australia were assigned to analyse exome sequence data from a range of patients with a rare condition: disorder of sex development. Initially, these groups used their own in-house data processing pipelines; subsequently, they used a common bioinformatics workbench based upon Galaxy and offered through the Australia-wide National eResearch Collaboration Tools and Resources (NeCTAR) Research Cloud. This paper describes the experiences in this work and the variability of results. We put forward principles that should be used to ensure reproducibility of scientific results moving forward.
Pages: 220-225
Number of pages: 6