Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

Cited by: 0
Authors
Jing, Li [1]
Shen, Yichen [1]
Dubcek, Tena [1]
Peurifoy, John [1]
Skirlo, Scott [1]
LeCun, Yann [2]
Tegmark, Max [1]
Soljacic, Marin [1]
Institutions
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] New York Univ, Facebook AI Res, New York, NY 10003 USA
Funding
US National Science Foundation
Keywords: (none listed)
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNN); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
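The abstract's two headline claims, tunable capacity and O(1) cost per parameter, can be made concrete with a small sketch. The following is a minimal NumPy illustration, not the authors' released implementation: the rotation-block convention, the alternating coordinate pairing, and all names here are assumptions chosen to illustrate the Givens-rotation decomposition the abstract alludes to. Each layer applies parametrized 2x2 unitary rotations to adjacent coordinate pairs, so each parameter touches a constant-size block, and capacity is tuned by the number of stacked layers.

import numpy as np

def eunn_layer(h, thetas, phis, offset):
    """Apply one layer of pairwise Givens-style rotations to a hidden state h.

    Each parametrized rotation mixes only the neighbouring coordinates
    (i, i+1), so applying (and differentiating) the layer costs O(1) per
    parameter. Alternating `offset` between 0 and 1 pairs different
    coordinates in successive layers.
    """
    h = h.astype(complex).copy()
    for k, i in enumerate(range(offset, len(h) - 1, 2)):
        c, s, p = np.cos(thetas[k]), np.sin(thetas[k]), np.exp(1j * phis[k])
        hi, hj = h[i], h[i + 1]
        # 2x2 unitary block [[c*p, -s*p], [s, c]]: norm-preserving by construction
        h[i], h[i + 1] = c * p * hi - s * p * hj, s * hi + c * hj
    return h

# Capacity is tuned by the number of stacked layers L: L = 1 spans only a
# small subspace of the unitary group, while stacking more layers (up to
# L = N) widens the reachable set toward the full unitary space.
rng = np.random.default_rng(0)
N, L = 8, 4
h0 = rng.standard_normal(N) + 1j * rng.standard_normal(N)
h = h0
for layer in range(L):
    off = layer % 2
    npairs = (N - off) // 2
    h = eunn_layer(h, rng.uniform(0, 2 * np.pi, npairs),
                   rng.uniform(0, 2 * np.pi, npairs), off)
print(np.isclose(np.linalg.norm(h), np.linalg.norm(h0)))  # True: unitarity preserves the norm

The final check is the point of the construction: because every block is unitary, the hidden-state norm is exactly preserved, which is what prevents gradients from exploding or vanishing over long sequences.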
Pages: 9
Related Papers
50 items in total (items [31]-[40] shown)
  • [31] An application of neural networks in chemistry
    Kvasnicka, V.
    Chemical Papers, 1990, 44(06): 775-792
  • [32] The application of OBE to neural networks
    Jiang, Y.
    He, Q.
    Tong, T. S.
    Dilger, W.
    2004 IEEE International Joint Conference on Neural Networks, Vols 1-4, Proceedings, 2004: 961-966
  • [33] Differentiable neural architecture learning for efficient neural networks
    Guo, Qingbei
    Wu, Xiao-Jun
    Kittler, Josef
    Feng, Zhiquan
    Pattern Recognition, 2022, 126
  • [34] Tunable Nonlinear Activation Functions for Optical Neural Networks
    Williamson, Ian A. D.
    Hughes, Tyler W.
    Minkov, Momchil
    Bartlett, Ben
    Pai, Sunil
    Fan, Shanhui
    2020 Conference on Lasers and Electro-Optics (CLEO), 2020
  • [35] Tunable Floating-Point for Artificial Neural Networks
    Franceschi, Marta
    Nannarelli, Alberto
    Valle, Maurizio
    2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS), 2018: 289-292
  • [36] An extended class of synaptic operators with application for efficient VLSI implementation of cellular neural networks
    Dogaru, R.
    Crounse, K. R.
    Chua, L. O.
    IEEE Transactions on Circuits and Systems I-Regular Papers, 1998, 45(07): 745-753
  • [37] Application of neural networks to static equivalent networks
    Müller, V.
    Nelles, D.
    European Transactions on Electrical Power, 2002, 12(03): 217-223
  • [38] Efficient Evolution of ART Neural Networks
    Kaylani, A.
    Georgiopoulos, M.
    Mollaghasemi, M.
    Anagnostopoulos, G. C.
    2008 IEEE Congress on Evolutionary Computation, Vols 1-8, 2008: 3456+
  • [39] Efficient Scaling of Bayesian Neural Networks
    Epifano, Jacob R.
    Duong, Timothy
    Ramachandran, Ravi P.
    Rasool, Ghulam
    IEEE Access, 2024, 12: 150953-150961
  • [40] Efficient training of backpropagation neural networks
    Otair, Mohammed A.
    Salameh, Walid A.
    Neural Network World, 2006, 16(04): 291-311