Explicit Fourth-Order Runge–Kutta Method on Intel Xeon Phi Coprocessor

Authors
Beata Bylina
Joanna Potiopa
Affiliation
[1] Maria Curie-Skłodowska University, Department of Computer Science
Keywords
Intel Xeon Phi; Fourth-order Runge–Kutta method; CSR format; Intel Math Kernel Library (Intel MKL); SpMV; OpenMP
Abstract
This paper concerns an Intel Xeon Phi implementation of the explicit fourth-order Runge–Kutta method (RK4) for very sparse matrices with very short rows. Such matrices arise in Markovian modeling of computer and telecommunication networks. Two implementations running on Intel Xeon Phi, both using the CSR storage scheme, are investigated: one based on Intel Math Kernel Library (Intel MKL) routines and the authors' own. The Intel MKL-based implementation uses the high-performance BLAS and Sparse BLAS routines. In our own implementation we focus on OpenMP-style programming and implement the SpMV operation and vector addition using basic optimization techniques and vectorization. We evaluate our approach in native and offload modes for various numbers of cores and thread affinity settings. Both implementations are compared with respect to time, speedup, and performance. The numerical experiments on Intel Xeon Phi show that the performance of the authors' implementation is very promising: it gives a gain of up to two times over the multithreaded Intel MKL-based implementation running on a CPU (Intel Xeon processor), and even three times over the application that uses Intel MKL on Intel Xeon Phi.
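The two kernels named in the abstract, SpMV on a CSR matrix and vector addition, can be combined into one RK4 step as in the minimal sketch below. It is an illustration only, not the authors' implementation: it assumes a linear system dy/dt = A·y with constant A, and the names `csr_matrix`, `csr_spmv`, and `rk4_step` as well as the buffer layout are hypothetical. The OpenMP pragmas show where thread parallelism and vectorization would typically be applied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR container: row_ptr has n+1 entries;
 * col_idx and val each have row_ptr[n] entries. */
typedef struct {
    int n;
    const int *row_ptr;
    const int *col_idx;
    const double *val;
} csr_matrix;

/* y = A * x; rows are distributed over OpenMP threads. */
void csr_spmv(const csr_matrix *A, const double *x, double *y)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < A->n; ++i) {
        double sum = 0.0;
        for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
            sum += A->val[k] * x[A->col_idx[k]];
        y[i] = sum;
    }
}

/* One classical RK4 step of size h for dy/dt = A*y (assumed model).
 * k, tmp, and acc are caller-provided work buffers of length n. */
void rk4_step(const csr_matrix *A, double h, double *y,
              double *k, double *tmp, double *acc)
{
    /* Stage weights (1,2,2,1) and the offsets used to build the
     * argument of the next stage; the last offset is unused. */
    static const double weight[4] = {1.0, 2.0, 2.0, 1.0};
    static const double shift[4]  = {0.5, 0.5, 1.0, 0.0};
    const int n = A->n;
    assert(n > 0 && h > 0.0);

    for (int i = 0; i < n; ++i) {
        tmp[i] = y[i];   /* argument of the first stage */
        acc[i] = 0.0;    /* running sum k1 + 2*k2 + 2*k3 + k4 */
    }

    for (int s = 0; s < 4; ++s) {
        csr_spmv(A, tmp, k);                  /* k = A * tmp */
        #pragma omp parallel for simd
        for (int i = 0; i < n; ++i) {
            acc[i] += weight[s] * k[i];
            tmp[i] = y[i] + shift[s] * h * k[i];
        }
    }

    #pragma omp parallel for simd
    for (int i = 0; i < n; ++i)
        y[i] += (h / 6.0) * acc[i];
}
```

With GCC the sketch can be built with `-fopenmp`; without that flag the pragmas are ignored and the code runs serially. For matrices with very short rows, as described in the abstract, the inner loop in `csr_spmv` does little work per row, which is one reason a hand-tuned kernel may behave differently from a general-purpose library routine.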
Pages: 1073-1090
Page count: 17
Published in
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2017, 45 (05): 1073-1090