Thread Row Buffers: Improving Memory Performance Isolation and Throughput in Multiprogrammed Environments

Cited by: 6
Authors
Herrero, Enric [1, 2]
Gonzalez, Jose [3 ]
Canal, Ramon [1 ]
Tullsen, Dean [4 ]
Affiliations
[1] Univ Politecn Cataluna, Dept Arquitectura Comp, ES-08034 Barcelona, Spain
[2] Intel Labs Barcelona, Barcelona 08034, Spain
[3] Intel Labs Barcelona, Visual Comp Grp, Barcelona 08034, Spain
[4] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
Keywords
Memory controllers; DRAM; thread row buffers
DOI
10.1109/TC.2012.173
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The widespread adoption of chip multiprocessors in recent years has increased the number of applications that access DRAM simultaneously. As a result, memory access patterns have changed, significantly reducing row buffer locality and degrading performance and energy efficiency. Concurrent execution of applications has also exposed the need for performance isolation among threads in the memory controller, so that quality of service can be enforced in virtualized environments. Existing DRAM memories, however, force a tradeoff between throughput and isolation. To solve these problems, this paper proposes adding Thread Row Buffers (TRBs) to DRAM memories. TRBs keep one active row per thread. This increases DRAM efficiency by avoiding alternating accesses to a limited number of rows, and it allows the implementation of a memory scheduler that is not bound by the throughput-isolation tradeoff. Thread Row Buffers with Service Partitioning (TRB-SP) increase the row hit rate by 38 percent with respect to FR-FCFS and by 11 percent with respect to Cache DRAM. This, in turn, increases overall performance by 17 and 7 percent, respectively. TRB-SP also reduces the standard deviation of an application's memory access time by 40 percent over FR-FCFS, 31 percent over PAR-BS, and 42 percent over Cache DRAM.
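The locality effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's design or evaluation methodology: it assumes a single bank, a strict round-robin interleaving of threads, no request reordering, and no timing. The names `row_hit_rate` and `interleaved_trace` are hypothetical helpers introduced only for this illustration. It compares the row hit rate of one shared row buffer against one row buffer per thread.

```python
def row_hit_rate(accesses, per_thread):
    """Return the fraction of accesses that hit an already-open row.

    accesses: list of (thread_id, row) tuples in arrival order.
    per_thread: if True, each thread has its own row buffer (assumed TRB-like
    behavior); if False, all threads share a single row buffer.
    """
    open_rows = {}  # buffer key -> row currently held in that buffer
    hits = 0
    for thread_id, row in accesses:
        key = thread_id if per_thread else 0
        if open_rows.get(key) == row:
            hits += 1
        else:
            open_rows[key] = row  # row miss: the buffer now holds the new row
    return hits / len(accesses)


def interleaved_trace(num_threads, accesses_per_thread, row_run_length):
    """Build a round-robin trace where each thread streams through its own rows.

    Each thread touches the same row for row_run_length consecutive accesses
    before moving to its next row. Rows of different threads never coincide.
    """
    trace = []
    for i in range(accesses_per_thread):
        for t in range(num_threads):
            row = t * 1000 + i // row_run_length
            trace.append((t, row))
    return trace


trace = interleaved_trace(num_threads=4, accesses_per_thread=64, row_run_length=8)
shared = row_hit_rate(trace, per_thread=False)
private = row_hit_rate(trace, per_thread=True)
print(f"shared row buffer hit rate:     {shared:.3f}")
print(f"per-thread row buffer hit rate: {private:.3f}")
```

In this deliberately adversarial trace, the shared buffer is evicted on every access because consecutive requests always come from different threads, while the per-thread buffers miss only when a thread moves to a new row. Real schedulers such as FR-FCFS reorder requests to recover some locality, so the gap in this toy model is much larger than the percentages reported in the abstract.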
Pages: 1879-1892
Number of pages: 14