LARA: Locality-aware resource allocation to improve GPU memory-access time

Authors
Hossein BiTalebi
Farshad Safaei
Affiliation
[1] Shahid Beheshti University, Faculty of Computer Science and Engineering
Keywords
Cache contention; Memory divergence; Graphics Processing Unit (GPU); GPU-NoC; Interconnection network; Locality; Memory; Priority; Row access; Stall time;
DOI
Not available
Abstract
Memory access, a primary performance bottleneck for any processing unit, also plays a significant role in GPU performance. Beyond the inherently challenging stages of the GPU's memory access path, low locality among requests considerably increases memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and poor locality among requests that miss in the L1 cache contribute significantly to last-level cache (LLC) contention and main-memory row-switching overhead. In addition, the interconnection network routes request packets without regard to their locality, and such routing considerably disrupts the locality among the requests.
Pages: 14438-14460
Page count: 22