A two-level directory architecture for highly scalable cc-NUMA multiprocessors

Cited by: 27
Authors
Acacio, ME
González, J
García, JM
Duato, J
Affiliations
[1] Univ Murcia, Dept Ingn & Tecnol Comp, Fac Informat, E-30071 Murcia, Spain
[2] Intel Labs Barcelona, Intel Barcelona Res Ctr, Barcelona 08034, Spain
[3] Univ Politecn Valencia, Dept Informat Sistemas & Comp, Valencia 46010, Spain
Keywords
scalability; directory memory overhead; two-level directory architecture; compressed sharing codes; unnecessary coherence messages; cc-NUMA multiprocessor
DOI
10.1109/TPDS.2005.4
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information. It is desirable that the directory memory overhead be kept as low as possible, and that it scales very slowly with the size of the machine. Unfortunately, current directory architectures provide scalability at the expense of performance. This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance. First, we propose multilayer clustering as an effective approach to reduce the width of directory entries. Based on this concept, we derive three new compressed sharing codes, some of them with a space complexity of $O(\log_2(\log_2 N))$ for an N-node system. Then, we present a novel two-level directory architecture to eliminate the penalty caused by compressed directories in general. The proposed organization consists of a small full-map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information for all the lines). The proposals are evaluated based on extensive execution-driven simulations (using RSIM) of a 64-node cc-NUMA multiprocessor. Results demonstrate that a system with a two-level directory architecture achieves the same performance as a multiprocessor with a big and nonscalable full-map directory, with a very significant reduction of the memory overhead.
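To make the trade-off in the abstract concrete, the following is a minimal Python sketch of one possible compressed sharing code. It is an illustration under stated assumptions, not necessarily one of the three codes derived in the paper. It assumes the N nodes are the leaves of a binary tree and that a directory entry stores only the level of the smallest subtree containing the home node and all sharers. The function names (`full_map_bits`, `tree_level_bits`, `encode_level`, `decode_level`) are hypothetical.

```python
from typing import Iterable


def full_map_bits(n_nodes: int) -> int:
    """Full-map sharing code: one presence bit per node, so N bits per entry."""
    return n_nodes


def tree_level_bits(n_nodes: int) -> int:
    """Bits needed to store a subtree level in 0..ceil(log2 N) (assumed code)."""
    depth = (n_nodes - 1).bit_length()      # ceil(log2 N) for N >= 1
    levels = depth + 1                      # level 0 means "home node only"
    return (levels - 1).bit_length()        # ceil(log2(levels))


def encode_level(home: int, sharers: Iterable[int]) -> int:
    """Smallest level whose subtree contains the home node and every sharer."""
    level = 0
    for node in sharers:
        # The highest differing bit between home and node fixes the subtree height.
        level = max(level, (home ^ node).bit_length())
    return level


def decode_level(home: int, level: int) -> range:
    """All nodes in the encoded subtree: a superset of the real sharers."""
    base = (home >> level) << level
    return range(base, base + (1 << level))
```

In this sketch a 64-node system needs 3 bits per entry instead of 64, at the cost of precision: for home node 5 with sharers {4, 7} the decoded set is nodes 4 to 7, so node 6 would receive an unnecessary coherence message. A small precise first-level directory, as the abstract describes, is aimed at avoiding that penalty for recently referenced lines.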
Pages: 67-79
Page count: 13