Online Performance Analysis: An Event-based Workflow Design Towards Exascale

Times cited: 0
Authors
Wagner, Michael [1]
Hilbrich, Tobias [1]
Brunst, Holger [1]
Affiliations
[1] Tech Univ Dresden, Ctr Informat Serv & High Performance Comp ZIH, D-01062 Dresden, Germany
Source
2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014
DOI
10.1109/HPCC.2014.145
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Today, it is commonly accepted that speedup and efficiency are not granted automatically when developing or porting software for High Performance Computing (HPC) platforms. The reasons are manifold and actively investigated in the community. Software monitors are essential for these studies as they provide the raw performance data to be analyzed. We propose an online monitoring workflow for event-based performance analysis that takes into account the significant changes in system architecture towards the development of exascale supercomputers. Critical properties are: communication across large numbers of processing elements, limited I/O capabilities, and a decreasing memory-per-core ratio. We present a hierarchical data management and steering workflow that directly couples the application, measurement, and analysis processes, thus eliminating the need for extensive communication and buffering. The workflow is closely integrated with the native system communication API to enable best communication across processing elements. The memory issue is addressed with a new lossy hierarchical data compression technique for in-memory storage, intended for small, fixed-size buffers. Further, we abandon secondary storage to avoid potential I/O challenges. We demonstrate the feasibility of our design with a prototype implementation that features services for data collection, analysis, and runtime compression. Our evaluation extrapolates results obtained with the NAS Parallel Benchmarks at up to 2,048 processes to an exascale workflow.
Pages: 839-846
Number of pages: 8