LARA: Locality-aware resource allocation to improve GPU memory-access time

Cited by: 0
Authors
Hossein BiTalebi
Farshad Safaei
Affiliation
[1] Shahid Beheshti University, Faculty of Computer Science and Engineering
Source
Keywords
Cache contention; Memory divergence; Graphics Processing Unit (GPU); GPU-NoC; Interconnection network; Locality; Memory; Priority; Row access; Stall time;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Memory access, a primary performance bottleneck of any processing unit, also plays a significant role in GPU performance. Beyond the inherently challenging parts of the GPU's memory access path, low locality among requests considerably increases memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and the lack of locality among requests that miss in L1 significantly increase last-level cache contention and main-memory row-switching overheads. In addition, the interconnection network routes request packets regardless of their locality properties, and such routing considerably disrupts the locality among the requests.
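The row-switching overhead mentioned in the abstract can be illustrated with a toy model. The sketch below is not the LARA mechanism; it only assumes a single DRAM bank with one open row and a hypothetical row size of 2048 bytes, and compares the number of row switches for an interleaved request stream against the same requests grouped by row. The names `ROW_SIZE`, `count_row_switches`, and `group_by_row` are illustrative, not taken from the paper.

```python
from collections import defaultdict

ROW_SIZE = 2048  # hypothetical DRAM row size in bytes (assumption, not from the paper)


def row_of(address, row_size=ROW_SIZE):
    """Map a byte address to its row index in a single-bank toy model."""
    return address // row_size


def count_row_switches(addresses, row_size=ROW_SIZE):
    """Count how often consecutive requests target a different row."""
    switches = 0
    open_row = None
    for addr in addresses:
        row = row_of(addr, row_size)
        if open_row is not None and row != open_row:
            switches += 1
        open_row = row
    return switches


def group_by_row(addresses, row_size=ROW_SIZE):
    """Reorder requests so same-row requests are adjacent.

    Rows are served in order of first arrival; requests within a row
    keep their original relative order.
    """
    buckets = defaultdict(list)
    for addr in addresses:
        buckets[row_of(addr, row_size)].append(addr)
    return [addr for row in buckets for addr in buckets[row]]


# Two request streams to different rows, interleaved as they might be
# after locality-oblivious routing.
interleaved = [0, 4096, 64, 4160, 128, 4224]

print(count_row_switches(interleaved))
print(count_row_switches(group_by_row(interleaved)))
```

In this toy stream every request in the interleaved order hits a different row than its predecessor, while the grouped order switches rows only once. Real memory controllers have multiple banks, timing constraints, and fairness concerns that this model ignores.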
Pages: 14438 - 14460
Page count: 22
Related papers
50 in total
  • [31] Improving Memory Efficiency in Heterogeneous MPSoCs through Row-Buffer Locality-aware Forwarding
    Song, Yang
    Lin, Bill
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2020, 17 (01)
  • [32] InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-aware Inner Product Processing
    Baek, Daehyeon
    Hwang, Soojin
    Heo, Taekyung
    Kim, Daehoon
    Huh, Jaehyuk
    30TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT 2021), 2021, : 116 - 128
  • [33] Network Coding Aware Resource Allocation to Improve Throughput
    Zhang, Dan
    Su, Kai
    Mandayam, Narayan B.
    2012 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY PROCEEDINGS (ISIT), 2012,
  • [34] Criticality-aware priority to accelerate GPU memory access
    Hossein Bitalebi
    Farshad Safaei
    The Journal of Supercomputing, 2023, 79 : 188 - 213
  • [35] PIM-Enabled Instructions: A Low-Overhead, Locality-Aware Processing-in-Memory Architecture
    Ahn, Junwhan
    Yoo, Sungjoo
    Mutlu, Onur
    Choi, Kiyoung
    2015 ACM/IEEE 42ND ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2015, : 336 - 348
  • [36] Criticality-aware priority to accelerate GPU memory access
    Bitalebi, Hossein
    Safaei, Farshad
    JOURNAL OF SUPERCOMPUTING, 2023, 79 (01): : 188 - 213
  • [37] Scalable, resource and locality-aware selection of active scatterers in Geometry-based stochastic channel models
    Rainer, Benjamin
    Hofer, Markus
    Zelenbaba, Stefan
    Loeschenbrand, David
    Zemen, Thomas
    Ye, Xiaochun
    Priller, Peter
    2021 IEEE 32ND ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2021,
  • [38] BiloKey: A Scalable Bi-Index Locality-Aware In-Memory Key-Value Store
    Ma, Wenlong
    Zhu, Yuqing
    Li, Cheng
    Guo, Mengying
    Bao, Yungang
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (07) : 1528 - 1540
  • [39] Locality-Aware Replacement Algorithm in Flash Memory to Optimize Cloud Computing for Smart Factory of Industry 4.0
    He, Jianfan
    Jia, Gangyong
    Han, Guangjie
    Wang, Hao
    Yang, Xuan
    IEEE ACCESS, 2017, 5 : 16252 - 16262
  • [40] A run-time optimization approach for reducing data movements using locality-aware searching
    Liang Li
    Endong Wang
    Xingjun Zhang
    Kang Yan
    Tao Ju
    Xiaoshe Dong
    The Journal of Supercomputing, 2014, 69 : 864 - 886