A two-level directory architecture for highly scalable cc-NUMA multiprocessors

Cited by: 27

Authors:

Acacio, ME

González, J

García, JM

Duato, J

Affiliations:

[1] Univ Murcia, Dept Ingn & Tecnol Comp, Fac Informat, E-30071 Murcia, Spain

[2] Intel Labs Barcelona, Intel Barcelona Res Ctr, Barcelona 08034, Spain

[3] Univ Politecn Valencia, Dept Informat Sistemas & Comp, Valencia 46010, Spain

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2005 / Vol. 16 / No. 01

Keywords:

scalability; directory memory overhead; two-level directory architecture; compressed sharing codes; unnecessary coherence messages; cc-NUMA multiprocessor;

DOI:

10.1109/TPDS.2005.4

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

One important issue the designer of a scalable shared-memory multiprocessor must deal with is the amount of extra memory required to store the directory information. It is desirable that the directory memory overhead be kept as low as possible, and that it scales very slowly with the size of the machine. Unfortunately, current directory architectures provide scalability at the expense of performance. This work presents a scalable directory architecture that significantly reduces the size of the directory for large-scale configurations of a multiprocessor without degrading performance. First, we propose multilayer clustering as an effective approach to reduce the width of directory entries. Based on this concept, we derive three new compressed sharing codes, some of them with a space complexity of O(log_2(log_2(N))) for an N-node system. Then, we present a novel two-level directory architecture to eliminate the penalty caused by compressed directories in general. The proposed organization consists of a small full-map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information for all the lines). The proposals are evaluated based on extensive execution-driven simulations (using RSIM) of a 64-node cc-NUMA multiprocessor. Results demonstrate that a system with a two-level directory architecture achieves the same performance as a multiprocessor with a big and nonscalable full-map directory, with a very significant reduction of the memory overhead.
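The two-level organization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class name `TwoLevelDirectory`, its parameters, the LRU replacement policy, and the rule for allocating first-level entries are all assumptions made for illustration. The second level uses a simple coarse-vector code (one flag per cluster of nodes) as a stand-in for the paper's compressed sharing codes, which the abstract does not specify in detail.

```python
from collections import OrderedDict


class TwoLevelDirectory:
    """Toy model: precise first level (small, LRU) plus coarse second level."""

    def __init__(self, num_nodes=64, cluster_size=4, l1_entries=2):
        self.num_nodes = num_nodes
        self.cluster_size = cluster_size  # assumed coarse-vector granularity
        self.l1_entries = l1_entries      # assumed first-level capacity
        self.l1 = OrderedDict()           # line -> exact set of sharer nodes
        self.l2 = {}                      # line -> set of sharer cluster ids

    def add_sharer(self, line, node):
        if not 0 <= node < self.num_nodes:
            raise ValueError("node id out of range")
        first_reference = line not in self.l2
        # The second level tracks every line, but only at cluster granularity.
        self.l2.setdefault(line, set()).add(node // self.cluster_size)
        if line in self.l1:
            self.l1[line].add(node)
            self.l1.move_to_end(line)     # mark as most recently referenced
        elif first_reference:
            # Allocate a precise entry only when the full sharer set is known.
            if len(self.l1) >= self.l1_entries:
                self.l1.popitem(last=False)  # evict least recently used line
            self.l1[line] = {node}

    def invalidation_targets(self, line):
        """Nodes that must receive an invalidation for this line."""
        if line in self.l1:
            return set(self.l1[line])     # precise: no unnecessary messages
        clusters = self.l2.get(line, set())
        # Superset ("in-excess" information): every node of each cluster.
        return {c * self.cluster_size + i
                for c in clusters
                for i in range(self.cluster_size)}
```

In this sketch, a line that hits in the first level yields exactly its sharers, while a line that has been evicted falls back to the second level and yields every node in each sharing cluster. Those extra targets correspond to the unnecessary coherence messages listed in the keywords.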


Pages: 67-79

Number of pages: 13

