HyGain: High-performance, Energy-efficient Hybrid Gain Cell-based Cache Hierarchy

Authors
Singh, Sarabjeet [1]
Surana, Neelam [2]
Prasad, Kailash [3]
Jain, Pranjali [4]
Mekie, Joycee [3]
Awasthi, Manu [5]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
[2] NVIDIA Graph, Hyderabad, Telangana, India
[3] Indian Inst Technol, Dept Elect Engn, Gandhinagar, Gujarat, India
[4] Univ Calif Santa Barbara, Santa Barbara, CA USA
[5] Ashoka Univ, Hyderabad, Telangana, India
Keywords
Cache memory; emerging memories; Gain Cell; EMBEDDED DRAM; LOW-COST; STT-RAM; REFRESH; POWER; ARCHITECTURE; PREDICTION; SRAM;
DOI
10.1145/3572839
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we propose a "full-stack" solution to designing high-capacity and low-latency on-chip cache hierarchies by starting at the circuit level of the hardware design stack. We propose a novel half VDD precharge 2T Gain Cell (GC) design for the cache hierarchy. The GC has several desirable characteristics, including ~50% higher storage density and ~50% lower dynamic energy as compared to the traditional 6T SRAM, even after accounting for peripheral circuit overheads. We also demonstrate data retention time of 350 µs (~17.5x that of eDRAM) at 28 nm technology with VDD = 0.9 V and temperature = 27 °C that, combined with optimizations like staggered refresh, makes it an ideal candidate to architect all levels of on-chip caches. We show that compared to 6T SRAM, for a given area budget, GC-based caches, on average, provide 30% and 36% increase in IPC for single- and multi-programmed workloads, respectively, on contemporary workloads, including SPEC CPU 2017. We also observe dynamic energy savings of 42% and 34% for single- and multi-programmed workloads, respectively. Finally, in a quest to utilize the best of all worlds, we combine GC with STT-RAM to create hybrid hierarchies. We show that a hybrid hierarchy with GC caches at L1 and L2 and an LLC split between GC and STT-RAM is able to provide a 46% benefit in energy-delay product (EDP) as compared to an all-SRAM design, and 13% as compared to an all-GC cache hierarchy, averaged across multi-programmed workloads.
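As a rough sanity check on the figures quoted in the abstract, the short Python sketch below derives two implied quantities: the eDRAM retention time that a 17.5x ratio would correspond to, and the EDP gain of an all-GC hierarchy over all-SRAM. The variable names are illustrative and not from the paper. The second result rests on two assumptions: that a "benefit" means a fractional reduction in EDP, and that the two reported averages can be composed multiplicatively. Treat it as an approximation, not a result reported by the authors.

```python
# Figures quoted in the abstract.
gc_retention_us = 350.0          # GC data retention time, in microseconds
retention_ratio_vs_edram = 17.5  # GC retention relative to eDRAM
edp_gain_vs_sram = 0.46          # hybrid hierarchy vs. all-SRAM
edp_gain_vs_gc = 0.13            # hybrid hierarchy vs. all-GC

# Implied eDRAM retention time behind the 17.5x ratio.
implied_edram_retention_us = gc_retention_us / retention_ratio_vs_edram

# ASSUMPTION: "benefit" is a fractional EDP reduction, and the two averaged
# figures compose multiplicatively (averages need not compose exactly).
hybrid_edp = 1.0 - edp_gain_vs_sram               # normalized to all-SRAM = 1.0
all_gc_edp = hybrid_edp / (1.0 - edp_gain_vs_gc)  # all-GC EDP on the same scale
implied_gc_gain_vs_sram = 1.0 - all_gc_edp

print(f"Implied eDRAM retention: {implied_edram_retention_us:.1f} us")
print(f"Implied all-GC EDP gain over all-SRAM: {implied_gc_gain_vs_sram:.1%}")
```

Under these assumptions, the hybrid design's 13% gain over all-GC is an increment on top of a much larger gain that the GC caches already provide over SRAM.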
Pages: 20