LARA: Locality-aware resource allocation to improve GPU memory-access time

Cited by: 0
Authors
Hossein BiTalebi
Farshad Safaei
Affiliation
[1] Shahid Beheshti University, Faculty of Computer Science and Engineering
Keywords
Cache contention; Memory divergence; Graphics Processing Unit (GPU); GPU-NoC; Interconnection network; Locality; Memory; Priority; Row access; Stall time;
DOI
Not available
Abstract
Memory access, a primary performance bottleneck for any processing unit, also plays a significant role in GPU performance. Beyond the inherently challenging parts of the GPU's memory access path, low locality among requests considerably increases memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and poor locality among requests that miss in the L1 cache significantly increase last-level cache (LLC) contention and main-memory row-switching overhead. In addition, the interconnection network routes request packets without regard to their locality, and such routing considerably disrupts the locality among requests.
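The row-switching overhead mentioned in the abstract can be illustrated with a toy model. The sketch below is not the LARA mechanism; it is a hypothetical single-bank, open-row model (the function name `count_row_switches` and the request sequences are assumptions made for illustration) that counts how often the open DRAM row changes when the same requests arrive interleaved versus grouped by row.

```python
def count_row_switches(rows):
    """Count open-row changes for a sequence of row IDs served by one
    bank under an open-row policy (hypothetical toy model)."""
    switches = 0
    open_row = None
    for row in rows:
        if row != open_row:
            # Opening the very first row is not counted as a switch.
            if open_row is not None:
                switches += 1
            open_row = row
    return switches


# Same eight requests, two arrival orders (illustrative data only).
interleaved = [0, 1, 0, 1, 0, 1, 0, 1]   # locality disrupted
grouped = sorted(interleaved)            # requests to the same row kept together

print(count_row_switches(interleaved))   # 7
print(count_row_switches(grouped))       # 1
```

In this toy model, keeping same-row requests adjacent cuts the number of row switches from seven to one, which is the kind of effect that locality-aware handling of requests aims for.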
Pages: 14438-14460
Page count: 22