ISOLATING COSTS IN SHARED MEMORY COMMUNICATION BUFFERING

Authors
Byna, Surendra [1]
Cameron, Kirk W. [2]
Sun, Xian-He [1]
Affiliations
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
[2] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29208 USA
Keywords
Memory communication; Communication performance; Buffering
DOI
10.1142/S0129626405002271
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Communication in parallel applications combines data transfers within a source or destination with transfers across the network. Previous research on quantifying network transfer costs has indirectly reduced overall communication cost. Optimizing data transfer from source memory to the network interface has received less attention. In shared memory systems, such memory-to-memory transfers dominate communication cost. In distributed memory systems, memory-to-network-interface transfers grow in significance as processor and network speeds improve faster than memory latency. Our objective is to minimize the cost of internal data transfers. Following examples illustrating the impact of memory transfers on communication, we present a methodology for classifying the effects of data size and data distribution on hardware, middleware, and application software performance. This cost is quantified using hardware counter event measurements on the SGI Origin 2000. For the SGI O2K, we empirically identify the cost caused by just copying data from one buffer to another and the middleware overhead. We use MPICH in our experiments, but our techniques are generally applicable to any communication implementation.
Pages: 357 - 365
Number of pages: 9
Related papers
50 in total
  • [1] SYNCHRONIZATION AND COMMUNICATION COSTS OF LOOP PARTITIONING ON SHARED-MEMORY MULTIPROCESSOR SYSTEMS
    GUPTA, R
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1992, 3 (04) : 505 - 512
  • [2] SYNCHRONIZATION AND COMMUNICATION COSTS OF LOOP PARTITIONING ON SHARED-MEMORY MULTIPROCESSOR SYSTEMS
    GUPTA, R
    PROCEEDINGS OF THE 1989 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, VOL 2: SOFTWARE, 1989 : 23 - 30
  • [3] Data Communication in Multiprocessors: Shared and Fragmented Memory
    Jordan, Harry F.
    IT - Information Technology, 1988, 30 (02) : 129 - 138
  • [4] Enabling shared memory communication in networks of MPSoCs
    Lant, Joshua
    Concatto, Caroline
    Attwood, Andrew
    Pascual, Jose A.
    Ashworth, Mike
    Navaridas, Javier
    Lujan, Mikel
    Goodacre, John
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (21):
  • [5] Shared-Memory Communication for Containerized Workflows
    Hobson, Tanner
    Yildiz, Orcun
    Nicolae, Bogdan
    Huang, Jian
    Peterka, Tom
    21ST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2021), 2021 : 123 - 132
  • [6] Power and Cost Reduction by Hybrid Optical Packet Switching with Shared Memory Buffering
    Rhee, June-Koo Kevin
    Lee, Chan-Kyun
    Kim, Ji-Hwan
    Won, Yong-Hyub
    Choi, Jin Seek
    Choi, JungYul
    IEEE COMMUNICATIONS MAGAZINE, 2011, 49 (05) : 102 - 110
  • [7] On the Difference Between Shared Memory and Shared Address Space in HPC Communication
    Hori, Atsushi
    Ouyang, Kaiming
    Gerofi, Balazs
    Ishikawa, Yutaka
    SUPERCOMPUTING FRONTIERS, SCFA 2022, 2022, 13214 : 59 - 78
  • [8] Memory-Based Communication Facilities and asymmetric Distributed Shared Memory
    Matsumoto, T
    Hiraki, K
    INNOVATIVE ARCHITECTURE FOR FUTURE GENERATION HIGH-PERFORMANCE PROCESSORS AND SYSTEMS, PROCEEDINGS, 1998 : 30 - 39
  • [9] Towards Optimizing Energy Costs of Algorithms for Shared Memory Architectures
    Korthikanti, Vijay Anand
    Agha, Gul
    SPAA '10: PROCEEDINGS OF THE TWENTY-SECOND ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, 2010 : 157 - 165
  • [10] Communication in Shared Memory: Concepts, Definitions, and Efficient Detection
    Diener, Matthias
    Cruz, Eduardo H. M.
    Alves, Marco A. Z.
    Navaux, Philippe O. A.
    2016 24TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP), 2016 : 151 - 158