Performance Characterization and Evaluation of HPC Algorithms on Dissimilar Multicore Architectures

Authors
Krishnan, S. P. T. [1]
Veeravalli, Bharadwaj [2]
Affiliations
[1] Agcy Sci Technol & Res, Inst Infocomm Res, Singapore 138632, Singapore
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117583, Singapore
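Keywords
RNA SECONDARY STRUCTURE; PARALLEL GENETIC ALGORITHM; STRUCTURE PREDICTION; PSEUDOKNOTS; IMPLEMENTATION
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract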
In this paper, we share our experiences in using two important yet different High Performance Computing (HPC) architectures to evaluate two HPC algorithms. The first architecture is an Intel x64 ISA-based homogeneous multicore system with Uniform Memory Access (UMA), shared-memory Symmetric Multi-Processing. The second architecture is an IBM Power ISA-based heterogeneous multicore system with Non-Uniform Memory Access (NUMA), distributed-memory Asymmetric Multi-Processing. The two HPC algorithms predict biological molecular structures, specifically Ribonucleic Acid (RNA) secondary structures. The first algorithm is a parallelized version, which we created, of a popular serial RNA secondary structure prediction algorithm called PKNOTS. The second algorithm, called MARSs, is a new parallel-by-design algorithm that we have developed. We conducted large-scale experiments with both algorithms on hundreds of real RNA sequences. Based on the thousands of data points collected from these experiments, we report the observed performance metrics for both algorithms on the two architectures. From our experiments, we infer that architectures with specialized co-processors for number crunching, along with a high-speed memory bus and dedicated bus controllers, generally perform better than general-purpose multi-processor architectures. In addition, we observed that algorithms that are parallel by design scale and perform better by taking advantage of the underlying parallel architecture. We further share best practices for handling scalability with regard to workload size. We believe our results are applicable to other HPC applications on similar HPC architectures.
Pages: 1288-1295
Page count: 8