New generalized data structures for matrices lead to a variety of high-performance algorithms

Authors
Gustavson, FG [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe new data structures for full and packed storage of dense symmetric/triangular arrays that generalize both. Using the new data structures, one is led to several new algorithms that save "half" the storage and outperform the current block-based level-3 algorithms in LAPACK. We concentrate on the simplest forms of the new algorithms and show that, for Cholesky factorization, they are a direct generalization of LINPACK. This means that level-3 BLAS are not required to obtain level-3 performance. The replacements for level-3 BLAS are so-called kernel routines, and on IBM platforms they can be produced from simple textbook-type codes by the XLF Fortran compiler. In the sequel we label these "vanilla" codes. On Power3, which has a peak performance of 800 MFlop/s, the Cholesky results exceed 720 MFlop/s for n ≥ 200 and reach 735 MFlop/s. Using conventional full-format LAPACK DPOTRF with ESSL BLAS, one first gets 600 MFlop/s at n ≥ 600 and only reaches a peak of 620 MFlop/s. We have also produced simple square blocked full-matrix data formats in which the blocks themselves are stored in column-major (Fortran) or row-major (C) order. The simple algorithm for LU factorization with partial pivoting in this new data format is a direct generalization of the LINPACK algorithm DGEFA. Again, no conventional level-3 BLAS are required; the replacements are again so-called kernel routines. Programming for the square blocked full-matrix format can be accomplished in standard Fortran through the use of three- and four-dimensional arrays. Thus, no new compiler support is necessary. Finally, we mention that other, more complicated algorithms are possible, for example recursive ones. The recursive algorithms are also easily programmed via tables that record where the blocks are stored in the two-dimensional recursive block array.
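To make the square blocked idea concrete, the following is a minimal sketch in Python and not the author's Fortran code. It assumes the matrix order is a multiple of the block size, stores only the blocks on or below the diagonal (which is where the roughly "half" storage saving comes from), and uses a nested list `blocks[I][J][i][j]` to stand in for a four-dimensional array. All function names (`to_square_blocks`, `kernel_potrf`, `kernel_trsm`, `kernel_update`, `blocked_cholesky`, `from_square_blocks`) are hypothetical and chosen for illustration. The three kernels are plain textbook loops in the spirit of the "vanilla" codes, so the sketch shows the structure of the algorithm only and says nothing about performance.

```python
import math

def to_square_blocks(a, nb):
    """Copy the lower triangle of an n x n matrix into nb x nb blocks.

    Only blocks with J <= I are stored. Assumes n is a multiple of nb.
    """
    n = len(a)
    assert n % nb == 0, "this sketch assumes n is a multiple of nb"
    m = n // nb
    return [[[[float(a[I * nb + i][J * nb + j]) for j in range(nb)]
              for i in range(nb)]
             for J in range(I + 1)]
            for I in range(m)]

def kernel_potrf(d):
    """Textbook unblocked Cholesky of one diagonal block, in place."""
    nb = len(d)
    for j in range(nb):
        d[j][j] = math.sqrt(d[j][j] - sum(d[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, nb):
            s = sum(d[i][k] * d[j][k] for k in range(j))
            d[i][j] = (d[i][j] - s) / d[j][j]
        for i in range(j):
            d[i][j] = 0.0  # clear the strictly upper part of the block

def kernel_trsm(b, l):
    """Solve X * L^T = B for X, overwriting b with X (l is lower triangular)."""
    nb = len(l)
    for row in b:
        for j in range(nb):
            s = sum(row[k] * l[j][k] for k in range(j))
            row[j] = (row[j] - s) / l[j][j]

def kernel_update(c, a, b):
    """Compute C := C - A * B^T on square blocks, in place."""
    nb = len(c)
    for i in range(nb):
        for j in range(nb):
            c[i][j] -= sum(a[i][k] * b[j][k] for k in range(nb))

def blocked_cholesky(blocks):
    """Right-looking Cholesky over the stored lower blocks, in place."""
    m = len(blocks)
    for K in range(m):
        kernel_potrf(blocks[K][K])
        for I in range(K + 1, m):
            kernel_trsm(blocks[I][K], blocks[K][K])
        for I in range(K + 1, m):
            for J in range(K + 1, I + 1):
                kernel_update(blocks[I][J], blocks[I][K], blocks[J][K])

def from_square_blocks(blocks):
    """Reassemble the stored blocks into a full n x n lower triangular matrix."""
    m, nb = len(blocks), len(blocks[0][0])
    out = [[0.0] * (m * nb) for _ in range(m * nb)]
    for I in range(m):
        for J in range(I + 1):
            for i in range(nb):
                for j in range(nb):
                    out[I * nb + i][J * nb + j] = blocks[I][J][i][j]
    return out
```

A typical use would be `blocks = to_square_blocks(a, nb)`, then `blocked_cholesky(blocks)`, then `from_square_blocks(blocks)` to read back the factor L with A = L Lᵀ. Because every kernel works on one contiguous nb x nb block, the outer loops look like an unblocked factorization with blocks in place of scalars, which is one way to read the abstract's remark that the algorithms generalize LINPACK.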
Pages: 46-61
Number of pages: 16