A scalable virtual memory system based on decentralization for many-cores

Cited by: 3
Authors
Cai, Miao [1]
Zhang, Diming [2]
Huang, Hao [1]
Affiliations
[1] Nanjing Univ, Dept Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
[2] Jiangsu Univ Sci & Technol, Zhenjiang, Jiangsu, Peoples R China
Keywords
Virtual memory system; Scalability; Many-core; Reclamation; Lock
DOI
10.1016/j.sysarc.2020.101803
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Traditional centralized virtual memory (VM) system designs encounter severe scalability problems, which limit the performance gains of multithreaded applications on many-core systems. In this paper, we propose a decentralized system model to scale VM systems for many-cores. Our model improves system parallelism by avoiding resource sharing and minimizing state coordination. By applying the model, we build a novel scalable virtual memory system called MedusaVM+. MedusaVM+ presents a decentralized system architecture, which avoids resource conflicts and cache line contention among processors and threads. Furthermore, MedusaVM+ provides a scalable address space by incorporating decentralized VM space management and a hybrid page table design. Critical system services and internal system operations, such as TLB coherence, are also optimized to maximize system parallelism. Our prototype is implemented on the Linux kernel 4.4.0 and glibc 2.23. Experimental results on a 72-core machine demonstrate that MedusaVM+ scales much better than state-of-the-art systems and decreases memory consumption by up to 27× compared with current approaches. In microbenchmark experiments, MedusaVM+ achieves nearly linear speedup. When evaluated with multithreaded applications, MedusaVM+ also outperforms other systems by up to 4.5×.
Pages: 15