LARA: Locality-aware resource allocation to improve GPU memory-access time

Cited by: 0

Authors:

Hossein BiTalebi

Farshad Safaei

Affiliation:

[1] Shahid Beheshti University, Faculty of Computer Science and Engineering

Source:

The Journal of Supercomputing | 2021 / Vol. 77

Keywords:

Cache contention; Memory divergence; Graphics Processing Unit (GPU); GPU-NoC; Interconnection network; Locality; Memory; Priority; Row access; Stall time

DOI:

Not available

Abstract:

Memory access, a primary performance bottleneck of any processing unit, also plays a significant role in GPU performance. In addition to the highly challenging parts of the GPU's memory access path, low locality among requests considerably increases memory access delay. Despite their immense processing power, GPUs cannot reach their maximum throughput because of memory access bottlenecks. Memory divergence and the lack of locality among requests that miss in the L1 cache significantly increase last-level cache contention and main-memory row-switching overhead. In addition, the interconnection network routes request packets regardless of their locality properties, and such routing considerably disrupts the locality among the requests.

Pages: 14438-14460

Page count: 22
