NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units

Cited by: 24

Authors:

Hyun, Bongjoon [1]

Kwon, Youngeun [1]

Choi, Yujeong [1]

Kim, John [1]

Rhu, Minsoo [1]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon, South Korea

Source:

TWENTY-FIFTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXV) | 2020

Funding:

National Research Foundation, Singapore;

Keywords:

DOI:

10.1145/3373376.3378494

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

To satisfy the compute and memory demands of deep neural networks (DNNs), neural processing units (NPUs) are widely being utilized for accelerating DNNs. Similar to how GPUs have evolved from a slave device into a mainstream processor architecture, it is likely that NPUs will become first-class citizens in this fast-evolving heterogeneous architecture space. This paper makes a case for enabling address translation in NPUs to decouple the virtual and physical memory address space. Through a careful data-driven application characterization study, we root-cause several limitations of prior GPU-centric address translation schemes and propose a memory management unit (MMU) that is tailored for NPUs. Compared to an oracular MMU design point, our proposal incurs only an average 0.06% performance overhead.

Citation

Pages: 1109-1124

Page count: 16
