Wrangling distributed computing for high-throughput environmental science: An introduction to HTCondor

Cited by: 13
Authors
Erickson, Richard A. [1]
Fienen, Michael N. [2]
McCalla, S. Grace [1,4]
Weiser, Emily L. [1]
Bower, Melvin L. [1]
Knudson, Jonathan M. [1]
Thain, Greg [3]
Affiliations
[1] US Geol Survey, Upper Midwest Environm Sci Ctr, La Crosse, WI 54601 USA
[2] US Geol Survey, Wisconsin Water Sci Ctr, Middleton, WI USA
[3] Univ Wisconsin, Dept Comp Sci, 1210 W Dayton St, Madison, WI 53706 USA
[4] Univ Wisconsin, Wisconsin Inst Discovery, Madison, WI USA
Keywords
CLIMATE-CHANGE; IMPACTS
DOI
10.1371/journal.pcbi.1006468
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Biologists and environmental scientists now routinely solve computational problems that were unimaginable a generation ago. Examples include processing geospatial data, analyzing -omics data, and running large-scale simulations. Conventional desktop computing cannot handle these tasks when they are large, and high-performance computing is not always available nor the most appropriate solution for all computationally intense problems. High-throughput computing (HTC) is one method for handling computationally intense research. In contrast to high-performance computing, which uses a single "supercomputer," HTC can distribute tasks over many computers (e.g., idle desktop computers, dedicated servers, or cloud-based resources). HTC facilities exist at many academic and government institutes and are relatively easy to create from commodity hardware. Additionally, consortia such as Open Science Grid facilitate HTC, and commercial entities sell cloud-based solutions for researchers who lack HTC at their institution. We provide an introduction to HTC for biologists and environmental scientists. Our examples from biology and the environmental sciences use HTCondor, an open source HTC system.
Pages: 8