Online Performance Analysis: An Event-based Workflow Design Towards Exascale

Cited by: 0
Authors
Wagner, Michael [1]
Hilbrich, Tobias [1]
Brunst, Holger [1]
Affiliations
[1] Tech Univ Dresden, Ctr Informat Serv & High Performance Comp ZIH, D-01062 Dresden, Germany
Source
2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014
DOI
10.1109/HPCC.2014.145
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Today, it is commonly accepted that speedup and efficiency are not granted automatically when developing or porting software for High Performance Computing (HPC) platforms. The reasons are manifold and actively investigated in the community. Software monitors are essential for these studies as they provide the raw performance data to be analyzed. We propose an online monitoring workflow for event-based performance analysis that takes into account the significant changes in system architecture towards the development of exascale supercomputers. Critical properties are: communication across large numbers of processing elements, limited I/O capabilities, and a decreasing memory-per-core ratio. We present a hierarchical data management and steering workflow that directly couples the application, measurement, and analysis processes, thus eliminating the need for extensive communication and buffering. The workflow is closely integrated with the native system communication API to enable best communication across processing elements. The memory issue is addressed with a new lossy hierarchical data compression technique for in-memory storage, intended for small, fixed-size buffers. Further, we abandon secondary storage to avoid potential I/O challenges. We demonstrate the feasibility of our design with a prototype implementation that features services for data collection, analysis, and runtime compression. Our evaluation extrapolates results obtained with the NAS Parallel Benchmarks at up to 2,048 processes to an exascale workflow.
Pages: 839 - 846
Page count: 8
Related papers
50 in total
  • [41] A methodology for performance modeling of distributed event-based systems
    Kounev, Samuel
    Sachs, Kai
    Bacon, Jean
    Buchmann, Alejandro
    ISORC 2008: 11TH IEEE SYMPOSIUM ON OBJECT/COMPONENT/SERVICE-ORIENTED REAL-TIME DISTRIBUTED COMPUTING - PROCEEDINGS, 2008, : 13 - +
  • [42] Event-based safety and reliability analysis integration in model-based space mission design
    Hu, Yunpeng
    Peng, Qibo
    Ni, Qing
    Wu, Xinfeng
    Ye, Dongming
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2023, 229
  • [43] Event-based Detection of Changes in IaaS Performance Signatures
    Fattah, Sheik Mohammad Mostakim
    Bouguettaya, Athman
    2020 IEEE 13TH INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2020), 2020, : 210 - 217
  • [44] EVENT-BASED PERFORMANCE PERTURBATION - A CASE-STUDY
    MALONY, AD
    SIGPLAN NOTICES, 1991, 26 (07) : 201 - 212
  • [45] Integration of an Event-Based Simulation Framework into a Scientific Workflow Execution Environment for Grids and Clouds
    Ostermann, Simon
    Plankensteiner, Kassian
    Bodner, Daniel
    Kraler, Georg
    Prodan, Radu
    TOWARDS A SERVICE-BASED INTERNET, 2011, 6994 : 1 - +
  • [46] An Event-Based Online Scheduling Approach for Networked Embedded Control Systems
    Reimann, Sven
    Al-Areqi, Sanad
    Liu, Steven
    2013 AMERICAN CONTROL CONFERENCE (ACC), 2013, : 5326 - 5331
  • [47] ASAP: adaptive transmission scheme for online processing of event-based algorithms
    R. Tapia
    J. R. Martínez-de Dios
    A. Gómez Eguíluz
    A. Ollero
    Autonomous Robots, 2022, 46 : 879 - 892
  • [48] ASAP: adaptive transmission scheme for online processing of event-based algorithms
    Tapia, R.
    Martinez-de Dios, J. R.
    Gomez Eguiluz, A.
    Ollero, A.
    AUTONOMOUS ROBOTS, 2022, 46 (08) : 879 - 892
  • [49] Design and performance of the COMPASS online event filter
    Kuhn, Roland
    Nagel, Thiemo
    Konopka, Robert
    Paul, Stephan
    Schmitt, Lars
    2007 IEEE NUCLEAR SCIENCE SYMPOSIUM CONFERENCE RECORD, VOLS 1-11, 2007, : 1733 - +
  • [50] POLICY GRADIENT APPROACH OF EVENT-BASED OPTIMIZATION AND ITS ONLINE IMPLEMENTATION
    Xia, Li
    ASIAN JOURNAL OF CONTROL, 2014, 16 (06) : 1735 - 1743