Parallelizing SRAM Arrays with Customized Bit-Cell for Binary Neural Networks

Cited by: 0
Authors
Liu, Rui [1 ]
Peng, Xiaochen [1 ]
Sun, Xiaoyu [1 ]
Khwa, Win-San [2 ]
Si, Xin [2 ]
Chen, Jia-Jing [2 ]
Li, Jia-Fang [2 ]
Chang, Meng-Fan [2 ]
Yu, Shimeng [1 ]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85287 USA
[2] Natl Tsing Hua Univ, Hsinchu, Taiwan
Keywords
DOI
10.1145/3195970.3196089
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent advances in deep neural networks (DNNs) have shown that Binary Neural Networks (BNNs) can provide reasonable accuracy on various image datasets with a significant reduction in computation and memory cost. In this paper, we explore two BNNs: hybrid BNN (HBNN) and XNOR-BNN, where the weights are binarized to +1/-1 while the neuron activations are binarized to 1/0 and +1/-1, respectively. Two SRAM bit-cell designs are proposed, namely, 6T SRAM for HBNN and customized 8T SRAM for XNOR-BNN. In our design, the high-precision multiply-and-accumulate (MAC) is replaced by bitwise multiplication for HBNN, or XNOR for XNOR-BNN, plus bit-counting operations. To parallelize the weighted sum operation, we activate multiple word lines in the SRAM array simultaneously and digitize the analog voltage developed along the bit line by a multi-level sense amplifier (MLSA). In order to partition the large matrices in DNNs, we investigate the impact of the sensing bit-levels of the MLSA on the accuracy degradation for different sub-array sizes and propose using a nonlinear quantization technique to mitigate the accuracy degradation. With a 64x64 sub-array size and a 3-bit MLSA, the HBNN and XNOR-BNN architectures can limit the accuracy degradation to 2.37% and 0.88%, respectively, for a VGG-16-inspired network on the CIFAR-10 dataset. Design space exploration of SRAM-based synaptic architectures with the conventional row-by-row access scheme and our proposed parallel access scheme is also performed, showing significant benefits in area, latency and energy efficiency. Finally, we have successfully taped out and validated the proposed HBNN and XNOR-BNN designs in the TSMC 65 nm process with measured silicon data, achieving energy efficiency >100 TOPS/W for HBNN and >50 TOPS/W for XNOR-BNN.
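The arithmetic behind the abstract can be illustrated with a small software sketch. It is not the authors' circuit or code: the function names, the 1/0 bit encoding of +1/-1 values, and the uniform quantizer are all assumptions made for illustration. In particular, the paper proposes a nonlinear quantization technique, which is not reproduced here; a uniform quantizer stands in for the MLSA only to show how sub-array partial sums lose precision.

```python
def xnor_popcount_dot(w_bits, a_bits):
    """XNOR-BNN style dot product of two +1/-1 vectors.

    Both vectors are stored as bits (assumed encoding: 1 means +1, 0 means -1).
    XNOR marks the positions where the bits agree; counting those bits and
    rescaling gives the signed dot product.
    """
    matches = sum(1 for w, a in zip(w_bits, a_bits) if w == a)
    return 2 * matches - len(w_bits)


def hybrid_dot(w_bits, a_bits):
    """HBNN style weighted sum: weights are +1/-1, activations are 1/0.

    A zero activation contributes nothing; a one activation contributes
    the signed weight.
    """
    return sum((1 if w else -1) for w, a in zip(w_bits, a_bits) if a)


def quantized_xnor_dot(w_bits, a_bits, rows=64, levels=8):
    """Partition a long column into sub-arrays and quantize each partial count.

    `rows` models the sub-array height and `levels` the number of sensing
    levels (3 bits gives 8 levels). The quantizer here is uniform, which is
    an illustrative assumption, not the nonlinear scheme from the paper.
    """
    total = 0.0
    for start in range(0, len(w_bits), rows):
        w = w_bits[start:start + rows]
        a = a_bits[start:start + rows]
        n = len(w)
        matches = sum(1 for x, y in zip(w, a) if x == y)  # ranges over 0..n
        step = n / (levels - 1)
        quantized = round(matches / step) * step  # snap to the nearest level
        total += 2 * quantized - n
    return total


if __name__ == "__main__":
    w = [1, 0, 1, 1]
    a = [1, 1, 0, 1]
    print(xnor_popcount_dot(w, a))
    print(hybrid_dot(w, a))
```

With `levels` set to `rows + 1` the quantizer becomes exact, so `quantized_xnor_dot` matches `xnor_popcount_dot`; lowering `levels` shows how coarser sensing perturbs the accumulated sum.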
Pages: 6
Related papers
50 in total
  • [21] An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network
    Saha, Gobinda
    Jiang, Zhewei
    Parihar, Sanjay
    Cao, Xi
    Higman, Jack
    Ul Karim, Muhammed Ahosan
    IEEE ACCESS, 2020, 8 : 91405 - 91414
  • [22] Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks
    Wang, Yikai
    Yang, Yi
    Sun, Fuchun
    Yao, Anbang
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 5340 - 5349
  • [23] Leakage Current Optimization in 9T SRAM Bit-cell with Sleep Transistor at 45nm CMOS Technology
    Ruhil, Shaifali
    Shukla, Neeraj Kr.
    2017 INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES FOR SMART NATION (IC3TSN), 2017, : 259 - 261
  • [24] XNOR-SRAM: In-Memory Computing SRAM Macro for Binary/Ternary Deep Neural Networks
    Jiang, Zhewei
    Yin, Shihui
    Seok, Mingoo
    Seo, Jae-sun
    2018 IEEE SYMPOSIUM ON VLSI TECHNOLOGY, 2018, : 173 - 174
  • [25] XNOR-SRAM: In-Memory Computing SRAM Macro for Binary/Ternary Deep Neural Networks
    Yin, Shihui
    Jiang, Zhewei
    Seo, Jae-Sun
    Seok, Mingoo
    IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (06) : 1733 - 1743
  • [26] Analysis of Triple-Threshold Technique for Power Optimization in SRAM Bit-Cell for Low-Power Applications at 45 Nm CMOS Technology
    Kumar, Sudershan
    Ruhil, Shaifali
    Shukla, Neeraj Kr
    Birla, Shilpi
    INTELLIGENT COMPUTING TECHNIQUES FOR SMART ENERGY SYSTEMS, 2020, 607 : 611 - 618
  • [27] A PVT-robust Customized 4T Embedded DRAM Cell Array for Accelerating Binary Neural Networks
    Shin, Hyein
    Sim, Jaehyeong
    Lee, Daewoong
    Kim, Lee-Sup
    2019 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2019,
  • [28] Learning Optimum Binary Color Filter Arrays for Demosaicing with Neural Networks
    Ayna, Cemre Omer
    Gurbuz, Ali Cafer
    REAL-TIME IMAGE PROCESSING AND DEEP LEARNING 2024, 2024, 13034
  • [29] An Efficient Convolutional Neural Networks Design with Heterogeneous SRAM Cell Sizing
    Choi, Wonseok
    Park, Jongsun
    PROCEEDINGS INTERNATIONAL SOC DESIGN CONFERENCE 2017 (ISOCC 2017), 2017, : 103 - 104
  • [30] Design and Analysis of a Novel Low-Power SRAM Bit-Cell Structure at Deep-Sub-Micron CMOS Technology for Mobile Multimedia Applications
    Shukla, Neeraj Kr.
    Singh, R. K.
    Pattanaik, Manisha
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2011, 2 (05) : 43 - 49