Efficient on-chip training of large-scale optical neural network through block adjoint training algorithm

Cited: 0
Authors
Yang, Zhiwei [1 ,2 ]
Zhang, Tian [1 ,2 ]
Dai, Jian [1 ,2 ]
Xu, Kun [1 ,2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
Source
OPTICS EXPRESS | 2024, Vol. 32, Issue 26
Funding
National Natural Science Foundation of China;
Keywords
DESIGN;
DOI
10.1364/OE.537813
CLC Number
O43 [Optics];
Discipline Codes
070207; 0803;
Abstract
MZI-based block optical neural networks (BONNs), which use block matrix multiplication to realize large-scale network models, have attracted significant attention but still lack efficient training algorithms. In this article, we propose an on-chip block adjoint training (BAT) algorithm for large-scale BONNs, which calculates the original and adjoint fields for the block matrices and directly updates the phase values of all phase shifters within the optical mesh. To demonstrate the effectiveness of the proposed algorithm, the trained BONNs are applied to image classification on the MNIST and SVHN datasets. The results show that the accuracy of the BAT algorithm (95.915% on MNIST and 82.64% on SVHN) is competitive with that of the traditional gradient algorithm based on artificial neural networks (96.238% and 84.182%), while the BONNs infer 1.5 and 1.3 times faster than the artificial neural networks, respectively. By studying the influence of the block size and of the input position of the padded zero signals, we show that the BAT algorithm with a block size of 12 achieves higher performance when the padded zero signals are injected on the same side, adjacent to the normal input signals. Additionally, we show that substituting the complete weight matrices with unitary matrices when constructing BONNs is an efficient way to reduce both the system area and the number of trainable parameters. Finally, we demonstrate the relatively good robustness of the BAT algorithm and an imprecision-alleviation method based on on-chip retraining. Notably, the proposed BAT algorithm shows excellent potential for more complex tasks and network models.
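The adjoint update described in the abstract admits a compact numerical sketch: for each unitary block, a forward pass caches the original field at every phase-shifter column, a backward pass propagates the output error (the adjoint field) through the conjugate-transposed mesh, and each phase gradient is read off from the overlap of the two fields. The code below is a minimal illustration of this adjoint-field idea under stated assumptions, not the authors' implementation: the static beamsplitter mesh is approximated by a fixed random unitary `B`, the block structure is a hypothetical 2x2 grid of n-mode blocks, and all names (`block_forward`, `block_adjoint_grads`, `K`, `eta`) are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Fixed passive mixing stage -- a stand-in for the static MZI/beamsplitter mesh."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

def block_forward(phases, B, x):
    """One block: K layers of [programmable phase shifters -> static mixer B].
    Returns the output field and caches the field right after each shifter column."""
    f, cache = x.astype(complex), []
    for phi in phases:                          # phases: (K, n) real array
        g = np.exp(1j * phi) * f                # phase-shifter column
        cache.append(g)
        f = B @ g                               # interference through the mesh
    return f, cache

def block_adjoint_grads(phases, B, cache, err):
    """Adjoint pass: run the error field backward through the block and read
    dL/dphi from the overlap of the original and adjoint fields."""
    grads = np.zeros_like(phases)
    a = err                                     # adjoint field at the block output
    for k in range(len(phases) - 1, -1, -1):
        b = B.conj().T @ a                      # back through the static mixer
        grads[k] = -2.0 * np.imag(np.conj(b) * cache[k])  # field-overlap gradient
        a = np.exp(-1j * phases[k]) * b         # back through the shifter column
    return grads

# Toy BONN layer: a 2x2 grid of n-mode unitary blocks acting on a 2n-mode field,
# trained by gradient descent on L = ||y - t||^2.
n, K, eta = 4, 3, 0.05
B = random_unitary(n)                           # one shared static mesh, for brevity
phases = rng.uniform(0, 2 * np.pi, size=(2, 2, K, n))
x = rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n))
t = rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n))

for step in range(200):
    y, caches = np.zeros((2, n), complex), {}
    for i in range(2):
        for j in range(2):                      # block matrix-vector product
            out, caches[i, j] = block_forward(phases[i, j], B, x[j])
            y[i] += out
    err = y - t                                 # output error seeds the adjoint field
    for i in range(2):
        for j in range(2):                      # per-block phase update
            phases[i, j] -= eta * block_adjoint_grads(
                phases[i, j], B, caches[i, j], err[i])
    if step % 50 == 0:
        print(f"step {step}: loss {np.sum(np.abs(err) ** 2):.4f}")
```

On hardware, the same forward/adjoint field overlap would be obtained from in-situ intensity measurements rather than matrix algebra, which is what makes the update on-chip; the per-block decomposition is what lets the scheme scale to large BONNs.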
Pages: 46633-46648
Page count: 16
Related Papers
50 records in total
  • [21] Efficient stochastic parallel gradient descent training for on-chip optical processor
    Wan, Yuanjian
    Liu, Xudong
    Wu, Guangze
    Yang, Min
    Yan, Guofeng
    Zhang, Yu
    Wang, Jian
    OPTO-ELECTRONIC ADVANCES, 2024, 7 (04)
  • [22] Efficient Communications in Training Large Scale Neural Networks
    Zhao, Yiyang
    Wang, Linnan
    Wu, Wei
    Bosilca, George
    Vuduc, Richard
    Ye, Jinmian
    Tang, Wenqi
    Xu, Zenglin
    PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017, : 110 - 116
  • [23] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [24] TIGER: Training Inductive Graph Neural Network for Large-scale Knowledge Graph Reasoning
    Wang, Kai
    Xu, Yuwei
    Luo, Siqiang
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (10): : 2459 - 2472
  • [25] Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
    Subia-Waud, Christopher
    Dasmahapatra, Srinandan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] NeutronSketch: An in-depth exploration of redundancy in large-scale graph neural network training
    Liu, Yajiong
    Zhang, Yanfeng
    Wang, Qiange
    Yuan, Hao
    Ai, Xin
    Yu, Ge
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [27] An efficient algorithm for large-scale RFID Network Planning
    Bin Hasnan, Khalid
    Talib, Nihad Hasan
    Bin Nawawi, Azli
    Abdullah, Haslina Binti
    Elewe, Adel Muhsin
    Tahir, Suhaidah
    2019 IEEE JORDAN INTERNATIONAL JOINT CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY (JEEIT), 2019, : 519 - 524
  • [28] Training of large-scale feed-forward neural networks
    Seiffert, Udo
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 5324 - 5329
  • [29] TT-GNN: Efficient On-Chip Graph Neural Network Training via Embedding Reformation and Hardware Optimization
    Qu, Zheng
    Niu, Dimin
    Li, Shuangchen
    Zheng, Hongzhong
    Xie, Yuan
    56TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, MICRO 2023, 2023, : 452 - 464
  • [30] Efficient Interactive Training Selection for Large-Scale Entity Resolution
    Wang, Qing
    Vatsalan, Dinusha
    Christen, Peter
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 562 - 573