Transitive Closure on the Cell Broadband Engine: A study on Self-Scheduling in a Multicore Processor

Cited by: 0

Authors:

Vinjamuri, Sudhir [1]

Prasanna, Viktor K. [1]

Affiliation:

[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90007 USA

Source:

2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5 | 2009

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper we present a mapping methodology and optimizations for solving transitive closure on the Cell multicore processor. Using our approach, it is possible to achieve near peak performance for transitive closure on the Cell processor. We first parallelize the standard Floyd-Warshall algorithm and show through analysis and experimental results that data communication is a bottleneck for performance and scalability. We parallelize a cache-optimized version of the Floyd-Warshall algorithm to remove the memory bottleneck. As is the case with several scientific computing and industrial applications on a multicore processor, synchronization and scheduling of the cores play a crucial role in determining the performance of this algorithm. We define a self-scheduling mechanism for the cores of a multicore processor and design a self-scheduler for the Blocked Floyd-Warshall algorithm on the Cell multicore processor to remove the scheduling bottleneck. We also present optimizations in scheduling order to remove synchronization points. Our implementations achieved up to 78 GFLOPS.
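For readers unfamiliar with the two algorithms named in the abstract, the following is a minimal sequential sketch in Python of Floyd-Warshall transitive closure and of a commonly used three-phase blocked variant. It is not the authors' Cell implementation and contains no parallelism or self-scheduling; the function names, the block size `b`, and the boolean adjacency-matrix representation are assumptions made for illustration only.

```python
def transitive_closure(adj):
    """Standard Floyd-Warshall transitive closure on a boolean adjacency matrix."""
    n = len(adj)
    reach = [row[:] for row in adj]  # work on a copy
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def _update_block(reach, bi, bj, bk, b, n):
    """Update block (bi, bj) using the intermediate vertices of block bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            if reach[i][k]:
                for j in range(bj * b, min((bj + 1) * b, n)):
                    if reach[k][j]:
                        reach[i][j] = True


def blocked_transitive_closure(adj, b=2):
    """Blocked variant: the matrix is processed in b x b tiles (b is hypothetical)."""
    n = len(adj)
    reach = [row[:] for row in adj]
    nb = (n + b - 1) // b  # number of blocks per dimension
    for bk in range(nb):
        # Phase 1: the diagonal block depends only on itself.
        _update_block(reach, bk, bk, bk, b, n)
        # Phase 2: blocks in the same block-row and block-column as the diagonal block.
        for x in range(nb):
            if x != bk:
                _update_block(reach, bk, x, bk, b, n)
                _update_block(reach, x, bk, bk, b, n)
        # Phase 3: all remaining blocks; these are mutually independent.
        for bi in range(nb):
            for bj in range(nb):
                if bi != bk and bj != bk:
                    _update_block(reach, bi, bj, bk, b, n)
    return reach
```

In this sketch, the blocks within phase 2 and within phase 3 of a round do not depend on one another, so they are the natural units of work to distribute across cores; the ordering between phases is where synchronization would arise in a parallel version.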


Pages: 999 / 1009

Page count: 11
