A Low Memory Requirement MobileNets Accelerator Based on FPGA for Auxiliary Medical Tasks

Cited by: 3
Authors
Lin, Yanru [1 ]
Zhang, Yanjun [2 ]
Yang, Xu [3 ]
Affiliations
[1] Beijing Inst Technol, Sch Integrated Circuits & Elect, 5 South St, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Sch Cyberspace Sci & Technol, 5 South St, Beijing 100081, Peoples R China
[3] Beijing Inst Technol, Sch Comp Sci & Technol, 5 South St, Beijing 100081, Peoples R China
Source
BIOENGINEERING-BASEL | 2023, Vol. 10, Issue 01
Keywords
convolutional neural network; FPGA; hardware accelerator; MobileNetV2; auxiliary medical tasks;
DOI
10.3390/bioengineering10010028
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline Classification Codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Convolutional neural networks (CNNs) have been widely applied to medical tasks because they can achieve high accuracy in many fields, at the cost of a large number of parameters and operations. However, many applications designed for auxiliary checks or assistance need to be deployed on portable devices, where the huge number of operations and parameters of a standard CNN becomes an obstacle. MobileNet replaces the standard convolution with a depthwise separable convolution, which greatly reduces the number of operations and parameters while maintaining relatively high accuracy. Such highly structured models are well suited to FPGA implementation, which can further reduce resource requirements and improve efficiency. Because MobileNets already reduce both parameters and operations to significant effect, most other implementations focus on performance rather than on resource requirements. However, many small devices have such limited resources that they cannot run MobileNet-like efficient networks in the usual way, while many auxiliary medical applications still require a high-performance network running in real time. Hence, a specific accelerator structure is needed that further reduces memory and other resource requirements while running MobileNet-like efficient networks. In this paper, a MobileNet accelerator is proposed that minimizes both the on-chip memory capacity and the amount of data transferred between on-chip and off-chip memory. We propose two configurable computing modules, a Pointwise Convolution Accelerator and a Depthwise Convolution Accelerator, to parallelize the network and reduce the memory requirement with a specific dataflow model. A new cache usage method is also proposed to further reduce on-chip memory use.
We implemented the accelerator on Xilinx XC7Z020, deployed MobileNetV2 on it, and achieved 70.94 FPS with 524.25 KB on-chip memory usage under 150 MHz.
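The savings the abstract attributes to depthwise separable convolution can be sketched with a quick parameter and multiply-accumulate count. This is an illustrative calculation, not code from the paper; the layer sizes below are assumptions chosen to resemble an early MobileNet layer.

```python
# Compare a standard convolution with the depthwise separable convolution
# (depthwise k x k filter per channel, followed by a pointwise 1 x 1 mix)
# that MobileNet uses in place of it.

def standard_conv_cost(k, c_in, c_out, h, w):
    """Parameter count and MACs for a k x k standard convolution."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight fires once per output position
    return params, macs

def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Parameter count and MACs for depthwise + pointwise convolution."""
    dw_params = k * k * c_in   # one k x k filter per input channel
    pw_params = c_in * c_out   # 1 x 1 convolution mixing channels
    macs = (dw_params + pw_params) * h * w
    return dw_params + pw_params, macs

# Assumed example layer: 3x3 kernel, 32 -> 64 channels, 112 x 112 feature map
std_p, std_m = standard_conv_cost(3, 32, 64, 112, 112)
sep_p, sep_m = depthwise_separable_cost(3, 32, 64, 112, 112)
print(std_p, sep_p)             # 18432 2336
print(round(std_m / sep_m, 2))  # 7.89
```

The reduction factor approaches 1/c_out + 1/k^2, so for a 3x3 kernel the separable form needs roughly an eighth of the operations, which is why such networks fit the resource-constrained FPGA setting the paper targets.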
Pages: 15
Related Papers
50 records
  • [1] FPGA-based Low-Batch Training Accelerator for Modern CNNs Featuring High Bandwidth Memory
    Venkataramanaiah, Shreyas K.
    Suh, Han-Sok
    Yin, Shihui
    Nurvitadhi, Eriko
    Dasu, Aravind
    Cao, Yu
    Seo, Jae-Sun
    2020 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED-DESIGN (ICCAD), 2020,
  • [2] An Auxiliary Tasks Based Framework for Automated Medical Skill Assessment with Limited Data
    Zhao, Shang
    Zhang, Xiaoke
    Jin, Fang
    Hahn, James
    2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021, : 1613 - 1617
  • [3] An Accelerator for Resolution Proof Checking based on FPGA and Hybrid Memory Cube Technology
    Hansmeier, Tim
    Platzner, Marco
    Pantho, Md Jubaer Hossain
    Andrews, David
    Journal of Signal Processing Systems, 2019, 91 : 1259 - 1272
  • [4] An Accelerator for Resolution Proof Checking based on FPGA and Hybrid Memory Cube Technology
    Hansmeier, Tim
    Platzner, Marco
    Pantho, Md Jubaer Hossain
    Andrews, David
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2019, 91 (11-12): : 1259 - 1272
  • [5] A Low Power and Low Latency FPGA-Based Spiking Neural Network Accelerator
    Liu, Hanwen
    Chen, Yi
    Zeng, Zihang
    Zhang, Malu
    Qu, Hong
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [6] FitNN: A Low-Resource FPGA-Based CNN Accelerator for Drones
    Zhang, Zhichao
    Mahmud, M. A. Parvez
    Kouzani, Abbas Z.
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (21) : 21357 - 21369
  • [7] An FPGA-Based Reconfigurable Accelerator for Low-Bit DNN Training
    Shao, Haikuo
    Lu, Jinming
    Lin, Jun
    Wang, Zhongfeng
    2021 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2021), 2021, : 254 - 259
  • [8] DeCO: A DSP Block Based FPGA Accelerator Overlay With Low Overhead Interconnect
    Jain, Abhishek Kumar
    Li, Xiangwei
    Singhai, Pranjul
    Maskell, Douglas L.
    Fahmy, Suhaib A.
    2016 IEEE 24TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2016, : 1 - 8
  • [9] A low-latency LSTM accelerator using balanced sparsity based on FPGA
    Jiang, Jingfei
    Xiao, Tao
    Xu, Jinwei
    Wen, Dong
    Gao, Lei
    Dou, Yong
    MICROPROCESSORS AND MICROSYSTEMS, 2022, 89
  • [10] FPGA-based Accelerator for Long Short-Term Memory Recurrent Neural Networks
    Guan, Yijin
    Yuan, Zhihang
    Sun, Guangyu
    Cong, Jason
    2017 22ND ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2017, : 629 - 634