Efficient on-chip training of large-scale optical neural network through block adjoint training algorithm

Cited: 0
Authors
Yang, Zhiwei [1 ,2 ]
Zhang, Tian [1 ,2 ]
Dai, Jian [1 ,2 ]
Xu, Kun [1 ,2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
Source
OPTICS EXPRESS | 2024, Vol. 32, Issue 26
Funding
National Natural Science Foundation of China;
Keywords
DESIGN;
DOI
10.1364/OE.537813
CLC Number
O43 [Optics];
Discipline Codes
070207; 0803;
Abstract
MZI-based block optical neural networks (BONNs), which use block matrix multiplication to realize large-scale network models, have attracted significant attention but still lack efficient training algorithms. In this article, we propose an on-chip block adjoint training (BAT) algorithm for large-scale BONNs, which calculates the original and adjoint fields for the block matrices and directly updates the phase values of all phase shifters within the optical mesh. To demonstrate the effectiveness of the proposed algorithm, the trained BONNs are applied to image classification on the MNIST and SVHN datasets. The results show that the accuracy of the BAT algorithm (95.915% on MNIST and 82.64% on SVHN) is competitive with that of the traditional gradient algorithm based on artificial neural networks (96.238% and 84.182%), while the BONNs infer 1.5 and 1.3 times faster than the artificial neural networks, respectively. By studying the influence of the block size and of the input position of the padded zero signals, we show that the BAT algorithm with a block size of 12 achieves higher performance when the padded zero signals are injected on the same side, adjacent to the normal input signals. Additionally, we show that substituting the complete weight matrices with unitary matrices when constructing BONNs is an efficient way to reduce both the system area and the number of trainable parameters. Finally, we demonstrate the relatively good robustness of the BAT algorithm and an imprecision-alleviation method based on on-chip retraining. Notably, the proposed BAT algorithm shows excellent potential for more complex tasks and network models.
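The adjoint update described in the abstract admits a compact numerical sketch: for each unitary block, a forward pass caches the original field at every phase-shifter column, a backward pass propagates the output error (the adjoint field) through the conjugate-transposed mesh, and each phase gradient is read off from the overlap of the two fields. The code below is a minimal illustration of this adjoint-field idea under stated assumptions, not the authors' implementation: the static beamsplitter mesh is approximated by a fixed random unitary `B`, the block structure is a hypothetical 2x2 grid of n-mode blocks, and all names (`block_forward`, `block_adjoint_grads`, `K`, `eta`) are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Fixed passive mixing stage -- a stand-in for the static MZI/beamsplitter mesh."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q

def block_forward(phases, B, x):
    """One block: K layers of [programmable phase shifters -> static mixer B].
    Returns the output field and caches the field right after each shifter column."""
    f, cache = x.astype(complex), []
    for phi in phases:                          # phases: (K, n) real array
        g = np.exp(1j * phi) * f                # phase-shifter column
        cache.append(g)
        f = B @ g                               # interference through the mesh
    return f, cache

def block_adjoint_grads(phases, B, cache, err):
    """Adjoint pass: run the error field backward through the block and read
    dL/dphi from the overlap of the original and adjoint fields."""
    grads = np.zeros_like(phases)
    a = err                                     # adjoint field at the block output
    for k in range(len(phases) - 1, -1, -1):
        b = B.conj().T @ a                      # back through the static mixer
        grads[k] = -2.0 * np.imag(np.conj(b) * cache[k])  # field-overlap gradient
        a = np.exp(-1j * phases[k]) * b         # back through the shifter column
    return grads

# Toy BONN layer: a 2x2 grid of n-mode unitary blocks acting on a 2n-mode field,
# trained by gradient descent on L = ||y - t||^2.
n, K, eta = 4, 3, 0.05
B = random_unitary(n)                           # one shared static mesh, for brevity
phases = rng.uniform(0, 2 * np.pi, size=(2, 2, K, n))
x = rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n))
t = rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n))

for step in range(200):
    y, caches = np.zeros((2, n), complex), {}
    for i in range(2):
        for j in range(2):                      # block matrix-vector product
            out, caches[i, j] = block_forward(phases[i, j], B, x[j])
            y[i] += out
    err = y - t                                 # output error seeds the adjoint field
    for i in range(2):
        for j in range(2):                      # per-block phase update
            phases[i, j] -= eta * block_adjoint_grads(
                phases[i, j], B, caches[i, j], err[i])
    if step % 50 == 0:
        print(f"step {step}: loss {np.sum(np.abs(err) ** 2):.4f}")
```

On hardware, the same forward/adjoint field overlap would be obtained from in-situ intensity measurements rather than matrix algebra, which is what makes the update on-chip; the per-block decomposition is what lets the scheme scale to large BONNs.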
Pages: 46633-46648
Page count: 16
Related Papers
50 records in total
  • [21] Efficient stochastic parallel gradient descent training for on-chip optical processor
    Wan, Yuanjian
    Liu, Xudong
    Wu, Guangze
    Yang, Min
    Yan, Guofeng
    Zhang, Yu
    Wang, Jian
    OPTO-ELECTRONIC ADVANCES, 2024, 7 (04)
  • [22] Efficient Communications in Training Large Scale Neural Networks
    Zhao, Yiyang
    Wang, Linnan
    Wu, Wei
    Bosilca, George
    Vuduc, Richard
    Ye, Jinmian
    Tang, Wenqi
    Xu, Zenglin
    PROCEEDINGS OF THE THEMATIC WORKSHOPS OF ACM MULTIMEDIA 2017 (THEMATIC WORKSHOPS'17), 2017, : 110 - 116
  • [23] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)
  • [24] TIGER: Training Inductive Graph Neural Network for Large-scale Knowledge Graph Reasoning
    Wang, Kai
    Xu, Yuwei
    Luo, Siqiang
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (10): : 2459 - 2472
  • [25] Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization
    Subia-Waud, Christopher
    Dasmahapatra, Srinandan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] NeutronSketch: An in-depth exploration of redundancy in large-scale graph neural network training
    Liu, Yajiong
    Zhang, Yanfeng
    Wang, Qiange
    Yuan, Hao
    Ai, Xin
    Yu, Ge
    KNOWLEDGE-BASED SYSTEMS, 2025, 309
  • [27] An efficient algorithm for large-scale RFID Network Planning
    Bin Hasnan, Khalid
    Talib, Nihad Hasan
    Bin Nawawi, Azli
    Abdullah, Haslina Binti
    Elewe, Adel Muhsin
    Tahir, Suhaidah
    2019 IEEE JORDAN INTERNATIONAL JOINT CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY (JEEIT), 2019, : 519 - 524
  • [28] Training of large-scale feed-forward neural networks
    Seiffert, Udo
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 5324 - 5329
  • [29] TT-GNN: Efficient On-Chip Graph Neural Network Training via Embedding Reformation and Hardware Optimization
    Qu, Zheng
    Niu, Dimin
    Li, Shuangchen
    Zheng, Hongzhong
    Xie, Yuan
    56TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, MICRO 2023, 2023, : 452 - 464
  • [30] Efficient Interactive Training Selection for Large-Scale Entity Resolution
    Wang, Qing
    Vatsalan, Dinusha
    Christen, Peter
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 562 - 573