Benchmarking the effects of operating system interference on extreme-scale parallel machines

Cited by: 46
Authors
Beckman, Pete [1]
Iskra, Kamil [1]
Yoshii, Kazutomo [1]
Coghlan, Susan [1]
Nataraj, Aroon [2]
Affiliations
[1] Argonne Natl Lab, Div Math & Comp Sci, Argonne, IL 60439 USA
[2] Univ Oregon, Dept Comp & Informat Sci, Eugene, OR 97403 USA
Keywords
microbenchmark; noise; petascale; synchronicity
DOI
10.1007/s10586-007-0047-2
CLC number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We investigate operating system noise, which we identify as one of the main reasons for a lack of synchronicity in parallel applications. Using a microbenchmark, we measure the noise on several contemporary platforms and find that, even with a general-purpose operating system, noise can be limited if certain precautions are taken. We then inject artificially generated noise into a massively parallel system and measure its influence on the performance of collective operations. Our experiments indicate that on extreme-scale platforms, the performance is correlated with the largest interruption to the application, even if the probability of such an interruption on a single process is extremely small. We demonstrate that synchronizing the noise can significantly reduce its negative influence.
Pages: 3-16
Page count: 14
Source
Cluster Computing, 2008, 11: 3-16