NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units

Cited by: 24
Authors
Hyun, Bongjoon [1]
Kwon, Youngeun [1]
Choi, Yujeong [1]
Kim, John [1]
Rhu, Minsoo [1]
Affiliations
[1] Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering, Daejeon, South Korea
Source
Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XXV) | 2020
Funding
National Research Foundation of Singapore
DOI
10.1145/3373376.3378494
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
To satisfy the compute and memory demands of deep neural networks (DNNs), neural processing units (NPUs) are widely used to accelerate them. Just as GPUs evolved from a slave device into a mainstream processor architecture, NPUs are likely to become first-class citizens in this fast-evolving heterogeneous architecture space. This paper makes a case for enabling address translation in NPUs to decouple the virtual and physical memory address spaces. Through a careful, data-driven application characterization study, we root-cause several limitations of prior GPU-centric address translation schemes and propose a memory management unit (MMU) tailored for NPUs. Compared to an oracular MMU design point, our proposal incurs an average performance overhead of only 0.06%.
Pages: 1109-1124
Page count: 16