Acceleration of PDE-based FTLE Calculations on Intel Multi-core and Many-core Architectures

Cited by: 0

Authors:

Wang, Fang [1]

Deng, Liang [2]

Zhao, Dan [2]

Li, Sikun [2]

Affiliations:

[1] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China

[2] China Aerodynam Res & Dev Ctr, Computat Aerodynam Inst, Mianyang, Peoples R China

Source:

PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015) | 2015

Keywords:

finite-time Lyapunov exponent (FTLE); coherent structure; partial differential equation (PDE); Intel MIC; hardware performance metrics; LAGRANGIAN COHERENT STRUCTURES; TIME LYAPUNOV EXPONENTS; FLUID-FLOWS; VORTEX; IDENTIFICATION; DEFINITION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203; 0835;

Abstract:

Finite-time Lyapunov exponent (FTLE) is widely used to extract coherent structures of unsteady flow. However, calculating the FTLE can be highly time-consuming, which greatly limits the performance of applications that rely on it. In this paper, we accelerate a double-precision PDE-based FTLE application for two- and three-dimensional analytical flow fields on Intel multi-core and many-core architectures such as Intel Sandy Bridge and the Intel Many Integrated Core (MIC) coprocessor. By analyzing the FTLE calculation process and the characteristics of Intel multi-core and many-core architectures, we employ three categories of optimization techniques to maximize performance: thread parallelism for multi-/many-core scaling, data parallelism to exploit the SIMD (single-instruction multiple-data) mechanism, and improved on-chip data reuse. We also discuss hardware performance metrics, collected with an open-source performance analysis tool, to explain the performance difference between Sandy Bridge and MIC. The experimental results show that our MIC-enabled FTLE achieves about a 1.8x speed-up relative to a parallel computation on two Intel Sandy Bridge CPUs, and perfect parallel efficiency is also observed.
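For readers unfamiliar with the quantity being accelerated, the following is a minimal sketch of the classical flow-map definition of the FTLE, sigma = (1/|T|) ln sqrt(lambda_max(F^T F)), where F is the gradient of the flow map over the integration time T. It is not the PDE-based formulation used in the paper, and it contains none of the thread, SIMD, or data-reuse optimizations the abstract describes. The linear saddle flow and the function names `velocity`, `advect`, and `ftle` are assumptions chosen for illustration only.

```python
import math


def velocity(x, y):
    """Hypothetical analytical flow (an assumption): steady saddle u = x, v = -y."""
    return x, -y


def advect(x, y, T, steps=200):
    """Advect a particle from (x, y) over time T with classical RK4."""
    h = T / steps
    for _ in range(steps):
        k1x, k1y = velocity(x, y)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y


def ftle(x, y, T, d=1e-4):
    """FTLE at (x, y) from a central-difference flow-map gradient."""
    xr, yr = advect(x + d, y, T)
    xl, yl = advect(x - d, y, T)
    xu, yu = advect(x, y + d, T)
    xd, yd = advect(x, y - d, T)
    # Flow-map gradient F = [[a, b], [c, e]]
    a = (xr - xl) / (2 * d)
    b = (xu - xd) / (2 * d)
    c = (yr - yl) / (2 * d)
    e = (yu - yd) / (2 * d)
    # Cauchy-Green tensor C = F^T F (symmetric 2x2)
    c11 = a * a + c * c
    c12 = a * b + c * e
    c22 = b * b + e * e
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form
    lam_max = 0.5 * (c11 + c22) + math.sqrt((0.5 * (c11 - c22)) ** 2 + c12 * c12)
    return math.log(math.sqrt(lam_max)) / abs(T)


if __name__ == "__main__":
    print(ftle(0.3, -0.2, 2.0))
```

For this particular saddle flow the exact FTLE is 1 everywhere, so the printed value should be very close to 1. Evaluating `ftle` independently at every grid point is what makes the computation expensive and, at the same time, a natural candidate for the kinds of parallelism the abstract lists.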


Pages: 178-183

Page count: 6

