Exploring parallel multi-GPU local search strategies in a metaheuristic framework

Cited by: 23

Authors:

Rios, Eyder [1,2]

Ochi, Luiz Satoru [2]

Boeres, Cristina [2]

Coelho, Vitor N. [2]

Coelho, Igor M. [3]

Farias, Ricardo [4]

Affiliations:

[1] Univ Estadual Piaui UESPI, Parnaiba, PI, Brazil

[2] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil

[3] Univ Estado Rio De Janeiro, Rio De Janeiro, RJ, Brazil

[4] Univ Fed Rio de Janeiro, COPPE Sistemas, Rio de Janeiro, RJ, Brazil

Source:

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING | 2018 / Vol. 111

Keywords:

Multi-GPU; Parallel metaheuristic; Local search; Minimum latency problem; VND; GRASP; ILS; COMBINATORIAL OPTIMIZATION; TRAVELING SALESMAN; IMPLEMENTATION; ALGORITHM;

DOI:

10.1016/j.jpdc.2017.06.011

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

Optimization tasks are often complex and CPU-time consuming, and usually involve finding the best (or a good enough) solution among the alternatives for a given problem. Parallel metaheuristics have been used in many real-world and scientific applications to solve these kinds of problems efficiently. Local Search (LS) is an essential component of some metaheuristics and very often represents the dominant computational effort of an algorithm. Several metaheuristic approaches try to adapt traditional LS models to parallel platforms without considering the intrinsic features of the available architectures. In this work, we present a novel local search strategy, called Distributed Variable Neighborhood Descent (DVND), specially designed for CPU and multi-GPU environments. Furthermore, a new neighborhood search strategy, called Multi Improvement, is introduced, which takes advantage of massive GPU parallelism to speed up LS procedures. A hard combinatorial problem, the Minimum Latency Problem (MLP), is considered as a case study. To tackle this problem, a hybrid metaheuristic algorithm is considered, which combines good-quality initial solutions, generated by a Greedy Randomized Adaptive Search Procedure (GRASP), with a flexible and powerful refinement procedure within the scope of an Iterated Local Search (ILS). DVND was compared to classic local search procedures, producing results that outperformed the best known sequential algorithm found in the literature. Speedups ranged from 7.3 to 13.7 for the larger MLP instances, with 500 to 1000 clients. The results demonstrate the effectiveness of the proposed techniques in terms of solution quality, performance and scalability. (C) 2017 Elsevier Inc. All rights reserved.
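
For readers unfamiliar with the terms in the abstract, the following is a minimal sequential sketch of a classic Variable Neighborhood Descent applied to a latency objective. It is not the paper's DVND or Multi Improvement strategy, and it uses no GPU. The function names, the two neighborhoods (swap and 2-opt), and the choice to measure latency along an open path starting at a depot are all assumptions made for illustration only.

```python
from itertools import combinations


def latency(tour, dist):
    """Sum of arrival times at each node after the depot tour[0].

    Assumption: open path (no return to the depot is counted).
    """
    total = 0.0
    elapsed = 0.0
    for a, b in zip(tour, tour[1:]):
        elapsed += dist[a][b]  # arrival time at node b
        total += elapsed
    return total


def swap_neighbors(tour):
    """All tours obtained by swapping two non-depot positions."""
    for i, j in combinations(range(1, len(tour)), 2):
        cand = tour[:]
        cand[i], cand[j] = cand[j], cand[i]
        yield cand


def two_opt_neighbors(tour):
    """All tours obtained by reversing one non-depot segment."""
    for i, j in combinations(range(1, len(tour)), 2):
        yield tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def vnd(tour, dist, neighborhoods):
    """Classic sequential VND with best-improvement moves.

    On an improvement, restart from the first neighborhood;
    otherwise move on to the next one.
    """
    best, best_cost = tour[:], latency(tour, dist)
    k = 0
    while k < len(neighborhoods):
        cand = min(neighborhoods[k](best),
                   key=lambda t: latency(t, dist), default=None)
        if cand is not None and latency(cand, dist) < best_cost:
            best, best_cost = cand, latency(cand, dist)
            k = 0
        else:
            k += 1
    return best, best_cost


# Hypothetical toy instance: four points on a line, depot at x = 0.
xs = [0, 1, 2, 3]
dist = [[abs(a - b) for b in xs] for a in xs]
best, cost = vnd([0, 3, 1, 2], dist, [swap_neighbors, two_opt_neighbors])
print(best, cost)  # [0, 1, 2, 3] 6.0
```

In this sketch every neighbor is evaluated one after another on the CPU. The evaluations within a neighborhood are independent of each other, which is the kind of work the abstract says the authors target with GPU parallelism.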

Pages: 39-55

Page count: 17

