Efficient on-chip training of large-scale optical neural network through block adjoint training algorithm

Cited by: 0
Authors
Yang, Zhiwei [1 ,2 ]
Zhang, Tian [1 ,2 ]
Dai, Jian [1 ,2 ]
Xu, Kun [1 ,2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
Source
OPTICS EXPRESS | 2024, Vol. 32, No. 26
Funding
National Natural Science Foundation of China;
Keywords
DESIGN;
DOI
10.1364/OE.537813
Chinese Library Classification (CLC) Number
O43 [Optics];
Discipline Code
070207; 0803;
Abstract
MZI-based block optical neural networks (BONNs), which use block matrix multiplication to realize large-scale network models, have attracted significant attention but still lack efficient training algorithms. In this article, by calculating the original field and the adjoint field for the block matrices in BONNs and directly updating the phase values of all phase shifters within the optical mesh, we propose an on-chip block adjoint training (BAT) algorithm for large-scale BONNs. To demonstrate the effectiveness of the proposed algorithm, the trained BONNs are applied to image classification on the MNIST and SVHN datasets. The results show that the accuracy of the BAT algorithm (95.915% on MNIST and 82.64% on SVHN) is competitive with that of the traditional gradient algorithm based on artificial neural networks (96.238% and 84.182%), while the BONNs perform inference 1.5 and 1.3 times faster than the artificial neural networks, respectively. By studying the influence of the block size and of the input position of the padded zero signals, we show that the BAT algorithm achieves higher performance on BONNs with a block size of 12 when the padded zeros are injected on the same side as the normal input signals. Additionally, we demonstrate that substituting unitary matrices for the complete weight matrices when constructing BONNs is an efficient way to reduce both the system area and the number of trainable parameters. Finally, we demonstrate the relatively good robustness of the BAT algorithm and an imprecision-alleviation method based on on-chip retraining. Notably, the proposed BAT algorithm shows excellent potential for more complex tasks and network models.
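To make the block-decomposition mechanics concrete, here is a minimal NumPy sketch of one BONN-style layer: a large weight matrix is tiled from small unitary blocks (each of which an MZI mesh could realize), and the input vector is zero-padded to a multiple of the block size, with a `pad_side` switch standing in for the paper's study of where the padded zeros enter. All names (`bonn_forward`, `random_unitary`, `pad_side`) are illustrative assumptions, not identifiers from the paper, and the on-chip adjoint-field phase update itself is not modeled here.

```python
import numpy as np

def random_unitary(k, rng):
    # QR of a random complex matrix yields a unitary Q; good enough for a demo
    a = rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))
    q, _ = np.linalg.qr(a)
    return q

def bonn_forward(x, blocks, k, pad_side="right"):
    """Block matrix-vector product y = Wx, with W tiled from k x k blocks.

    `blocks` is a grid (list of lists) of k x k matrices, each of which an
    MZI mesh could realize; the input is zero-padded to a multiple of k on
    the side chosen by `pad_side`.
    """
    pad = (-len(x)) % k
    zeros = np.zeros(pad)
    x = np.concatenate([x, zeros] if pad_side == "right" else [zeros, x])
    assert len(x) == len(blocks[0]) * k, "input length must match block grid"
    y = np.zeros(len(blocks) * k, dtype=complex)
    for i, row in enumerate(blocks):
        # each output block is a coherent sum over one row of block products
        y[i*k:(i+1)*k] = sum(B @ x[j*k:(j+1)*k] for j, B in enumerate(row))
    return y

rng = np.random.default_rng(0)
k = 4                                  # demo block size (the paper studies 12)
blocks = [[random_unitary(k, rng) for _ in range(2)] for _ in range(2)]
x = rng.normal(size=6)                 # 6 real inputs -> padded to 8 = 2*k
print(np.round(np.abs(bonn_forward(x, blocks, k)), 3))
```

Using one unitary block per tile, as above, also suggests why the abstract's area saving is plausible: an arbitrary block realized through its singular-value decomposition needs two MZI meshes plus a diagonal attenuator stage, whereas a unitary block needs a single mesh.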
Pages: 46633-46648 (16 pages)
Related Papers
50 in total
  • [31] A parallel SVM training algorithm on large-scale classification problems
    Zhang, JP
    Li, ZW
    Yang, J
    Proceedings of 2005 International Conference on Machine Learning and Cybernetics, Vols 1-9, 2005: 1637-1641
  • [32] ETC: Efficient Training of Temporal Graph Neural Networks over Large-scale Dynamic Graphs
    Gao, Shihong
    Li, Yiming
    Shen, Yanyan
    Shao, Yingxia
    Chen, Lei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17(05): 1060-1072
  • [33] Hybrid circuit-switched network for on-chip communication in large-scale chip-multiprocessors
    Luo, Hongyin
    Wei, Shaojun
    Chen, Deming
    Guo, Donghui
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2014, 74(09): 2818-2830
  • [34] EFFICIENT FPGA MAPPING OF GILBERT'S ALGORITHM FOR SVM TRAINING ON LARGE-SCALE CLASSIFICATION PROBLEMS
    Papadonikolakis, Markos
    Bouganis, Christos-Savvas
    2008 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, VOLS 1 AND 2, 2008: 384-389
  • [35] Understanding the Implication of Non-Volatile Memory for Large-Scale Graph Neural Network Training
    Lee, Yunjae
    Kwon, Youngeun
    Rhu, Minsoo
    IEEE COMPUTER ARCHITECTURE LETTERS, 2021, 20(02): 118-121
  • [36] An Allreduce Algorithm and Network Co-design for Large-Scale Training of Distributed Deep Learning
    Nguyen, Truong Thao
    Wahib, Mohamed
    21ST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2021), 2021: 396-405
  • [37] Research on Fault-Tolerant Algorithm for Memristor Neural Networks Based on On-Chip Training
    Wang, Lei
    Wu, Youyu
    2024 9TH INTERNATIONAL CONFERENCE ON ELECTRONIC TECHNOLOGY AND INFORMATION SCIENCE, ICETIS 2024, 2024: 221-224
  • [38] Efficient training for the hybrid optical diffractive deep neural network
    Fang, Tao
    Li, Jingwei
    Wu, Tongyu
    Cheng, Ming
    Dong, Xiaowen
    AI AND OPTICAL DATA SCIENCES III, 2022, 12019
  • [39] A Novel Evolutionary Algorithm For Block-Based Neural Network Training
    Niknam, Amin
    Hoseini, Pourya
    Mashoufi, Behbood
    Khoei, Abdollah
    2013 FIRST IRANIAN CONFERENCE ON PATTERN RECOGNITION AND IMAGE ANALYSIS (PRIA), 2013
  • [40] A generic control block for feedforward neural network with on-chip delta rule learning algorithm
    Tisan, Alin
    Buchman, A.
    Oniga, S.
    Gavrincea, C.
    2007 30TH INTERNATIONAL SPRING SEMINAR ON ELECTRONICS TECHNOLOGY, 2007: 567-570