Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor

Cited by: 23

Authors:

Szustak, Lukasz [1]

Rojek, Krzysztof [1]

Olas, Tomasz [1]

Kuczynski, Lukasz [1]

Halbiniak, Kamil [1]

Gepner, Pawel [2]

Affiliations:

[1] Czestochowa Tech Univ, Czestochowa, Poland

[2] Intel Corp, Swindon SN3 1RJ, Wilts, England

Source:

SCIENTIFIC PROGRAMMING | 2015 / Vol. 2015

Keywords:

ADVECTION TRANSPORT ALGORITHM; PARALLELIZATION; PERFORMANCE; ARCHITECTURES

DOI:

10.1155/2015/642705

Chinese Library Classification:

TP31 [Computer Software]

Subject classification codes:

081202; 0835

Abstract:

The multidimensional positive definite advection transport algorithm (MPDATA) belongs to the group of nonoscillatory forward-in-time algorithms and performs a sequence of stencil computations. MPDATA is one of the major parts of the dynamic core of the EULAG geophysical model. In this work, we outline an approach to adapting the 3D MPDATA algorithm to the Intel MIC architecture. In order to utilize the available computing resources, we propose the (3 + 1)D decomposition of MPDATA heterogeneous stencil computations. This approach is based on a combination of the loop tiling and loop fusion techniques. It allows us to ease memory/communication bounds and better exploit the theoretical floating-point efficiency of the target computing platforms. An important method of improving the efficiency of the (3 + 1)D decomposition is partitioning the available cores/threads into work teams, which reduces inter-cache communication overheads. This method also increases opportunities for the efficient distribution of MPDATA computation onto the available resources of the Intel MIC architecture, as well as Intel CPUs. We discuss preliminary performance results obtained on two hybrid platforms, each containing two CPUs and an Intel Xeon Phi. The top-of-the-line Intel Xeon Phi 7120P gives the best performance results and executes MPDATA almost 2 times faster than two Intel Xeon E5-2697v2 CPUs.
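The combination of loop tiling and fusion mentioned in the abstract can be illustrated with a toy sketch. The code below is not the authors' (3 + 1)D decomposition and not MPDATA itself: it is a 1D example with two made-up stencil stages (`stage_a`, `stage_b`), and all names and the tile size are assumptions chosen for illustration. It only shows the general idea that, instead of sweeping the whole array once per stage, each tile can run both stages back to back, recomputing a one-cell halo of the intermediate result so that the intermediate array never has to be stored in full.

```python
def stage_a(u, i):
    # Hypothetical first stencil: 3-point average of the input.
    return (u[i - 1] + u[i] + u[i + 1]) / 3.0


def stage_b(a, i):
    # Hypothetical second stencil: centered difference of the stage-A output.
    return a[i + 1] - a[i - 1]


def unfused(u):
    # Baseline: each stage sweeps the whole array and stores its full result.
    n = len(u)
    a = [0.0] * n
    for i in range(1, n - 1):
        a[i] = stage_a(u, i)
    out = [0.0] * n
    for i in range(2, n - 2):
        out[i] = stage_b(a, i)
    return out


def tiled_fused(u, tile=4):
    # Tiled and fused: both stages run per tile; the stage-A values live only
    # in a small per-tile buffer that includes a one-cell halo on each side.
    n = len(u)
    out = [0.0] * n
    for start in range(2, n - 2, tile):
        stop = min(start + tile, n - 2)
        a_local = {i: stage_a(u, i) for i in range(start - 1, stop + 1)}
        for i in range(start, stop):
            out[i] = stage_b(a_local, i)
    return out
```

Both functions produce identical output; the fused version trades a small amount of redundant halo computation for a much smaller intermediate working set, which is the kind of trade-off that tiling and fusion are generally used to make on cache-based processors.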

Pages: 14
