Transitive Closure on the Cell Broadband Engine: A study on Self-Scheduling in a Multicore Processor

Authors
Vinjamuri, Sudhir [1]
Prasanna, Viktor K. [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90007 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
In this paper we present a mapping methodology and optimizations for solving transitive closure on the Cell multicore processor. Using our approach, it is possible to achieve near-peak performance for transitive closure on the Cell processor. We first parallelize the standard Floyd-Warshall algorithm and show through analysis and experimental results that data communication is a bottleneck for performance and scalability. We then parallelize a cache-optimized version of the Floyd-Warshall algorithm to remove the memory bottleneck. As is the case with several scientific computing and industrial applications on a multicore processor, synchronization and scheduling of the cores play a crucial role in determining the performance of this algorithm. We define a self-scheduling mechanism for the cores of a multicore processor and design a self-scheduler for the Blocked Floyd-Warshall algorithm on the Cell multicore processor to remove the scheduling bottleneck. We also present optimizations in scheduling order to remove synchronization points. Our implementations achieved up to 78 GFLOPS.
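To make the two algorithm variants named in the abstract concrete, the following is a minimal sequential sketch in Python of the standard Floyd-Warshall transitive closure and a blocked (tiled) variant. It is not the authors' Cell implementation: the function names, the list-of-lists Boolean matrix, and the block-size parameter `b` are assumptions chosen for illustration, and no SPE mapping, DMA transfer, or self-scheduling is modeled. It only shows the three-phase structure of each blocked round. Within a round, the updates in phase 2 are independent of one another, as are those in phase 3, so these are the units a scheduler could hand out to cores.

```python
def transitive_closure(adj):
    """Standard Floyd-Warshall closure on a Boolean adjacency matrix."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def _update_block(reach, bi, bj, bk, b, n):
    """Update block (bi, bj) using the intermediate vertices of block bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            if reach[i][k]:
                for j in range(bj * b, min((bj + 1) * b, n)):
                    if reach[k][j]:
                        reach[i][j] = True


def blocked_transitive_closure(adj, b):
    """Blocked variant: the matrix is processed in b x b tiles (b is hypothetical)."""
    n = len(adj)
    reach = [row[:] for row in adj]
    nb = (n + b - 1) // b  # number of tiles per dimension
    for bk in range(nb):
        # Phase 1: the diagonal tile depends only on itself.
        _update_block(reach, bk, bk, bk, b, n)
        # Phase 2: tiles in the same tile row and tile column as the diagonal tile.
        for x in range(nb):
            if x != bk:
                _update_block(reach, bk, x, bk, b, n)
                _update_block(reach, x, bk, bk, b, n)
        # Phase 3: all remaining tiles, using the results of phase 2.
        for bi in range(nb):
            if bi == bk:
                continue
            for bj in range(nb):
                if bj != bk:
                    _update_block(reach, bi, bj, bk, b, n)
    return reach


if __name__ == "__main__":
    # Path graph 0 -> 1 -> 2 -> 3 plus an isolated vertex 4.
    n = 5
    adj = [[False] * n for _ in range(n)]
    for u, v in [(0, 1), (1, 2), (2, 3)]:
        adj[u][v] = True
    print(blocked_transitive_closure(adj, 2) == transitive_closure(adj))
```

Both functions return a new matrix and leave the input unchanged. A vertex is marked as reaching itself only if it lies on a cycle, since the sketch does not set the diagonal.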
Pages: 999-1009
Page count: 11