Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network

Cited by: 2238
Author
Sherstinsky, Alex
Institution
Keywords
RNN; RNN unfolding/unrolling; LSTM; External input gate; Convolutional input context windows; BACKPROPAGATION;
DOI
10.1016/j.physd.2019.132306
CLC number
O29 [Applied Mathematics];
Discipline code
070104 ;
Abstract
Because of their effectiveness in broad practical applications, LSTM networks have received a wealth of coverage in scientific journals, technical blogs, and implementation guides. However, in most articles, the inference formulas for the LSTM network and its parent, RNN, are stated axiomatically, while the training formulas are omitted altogether. In addition, the technique of "unrolling" an RNN is routinely presented without justification throughout the literature. The goal of this tutorial is to explain the essential RNN and LSTM fundamentals in a single document. Drawing from concepts in Signal Processing, we formally derive the canonical RNN formulation from differential equations. We then propose and prove a precise statement, which yields the RNN unrolling technique. We also review the difficulties with training the standard RNN and address them by transforming the RNN into the "Vanilla LSTM" network through a series of logical arguments. We provide all equations pertaining to the LSTM system together with detailed descriptions of its constituent entities. Albeit unconventional, our choice of notation and the method for presenting the LSTM system emphasizes ease of understanding. As part of the analysis, we identify new opportunities to enrich the LSTM system and incorporate these extensions into the Vanilla LSTM network, producing the most general LSTM variant to date. The target reader has already been exposed to RNNs and LSTM networks through numerous available resources and is open to an alternative pedagogical approach. A Machine Learning practitioner seeking guidance for implementing our new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well. (C) 2019 Elsevier B.V. All rights reserved.
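The abstract refers to the "Vanilla LSTM" inference equations and the RNN unrolling technique. As a rough orientation only, the sketch below shows the textbook vanilla LSTM step (forget/input/output gates plus a candidate cell update) and how the same cell is reapplied across a sequence, which is what "unrolling" means operationally. The function names, dictionary keys, and shapes are illustrative choices, not the paper's notation, and the sketch does not include the paper's proposed extensions (e.g. the external input gate or convolutional input context windows).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a textbook vanilla LSTM cell.

    W, U, b are dicts keyed by gate name: 'f' (forget), 'i' (input),
    'o' (output), 'g' (candidate cell update). These names are
    illustrative, not the paper's notation.
    """
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate state
    c_t = f * c_prev + i * g      # new cell state: gated blend of old and new
    h_t = o * np.tanh(c_t)        # new hidden state (the cell's output)
    return h_t, c_t

def lstm_unrolled(xs, h0, c0, W, U, b):
    """"Unrolling": apply the same cell, with the same weights,
    to each element of the input sequence in turn."""
    h, c = h0, c0
    hs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U, b)
        hs.append(h)
    return hs
```

Because each gate squashes its pre-activation through a sigmoid (or tanh for the candidate), every hidden-state component stays bounded in (-1, 1), which is part of why the cell-state pathway, rather than the hidden state, carries long-range information.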
Pages: 28
Related Papers (50 records)
  • [31] Spam SMS Detection Based on Long Short-Term Memory and Recurrent Neural Network
    Alseid, Marya
    Nassif, Ali Bou
    AlShabi, Mohammad
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS V, 2023, 12538
  • [32] Forecasting of the Stock Price Using Recurrent Neural Network - Long Short-term Memory
    Dobrovolny, Michal
    Soukal, Ivan
    Salamat, Ali
    Cierniak-Emerych, Anna
    Krejcar, Ondrej
    HRADEC ECONOMIC DAYS, VOL 11(1), 2021, 11 : 145 - 154
  • [33] Forecasting cryptocurrency prices using Recurrent Neural Network and Long Short-term Memory
    Nasirtafreshi, I.
    DATA & KNOWLEDGE ENGINEERING, 2022, 139
  • [34] Chemical Substance Classification Using Long Short-Term Memory Recurrent Neural Network
    Zhang, Jinlei
    Liu, Junxiu
    Luo, Yuling
    Fu, Qiang
    Bi, Jinjie
    Qiu, Senhui
    Cao, Yi
    Ding, Xuemei
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT 2017), 2017, : 1994 - 1997
  • [35] An Interpretation of Long Short-Term Memory Recurrent Neural Network for Approximating Roots of Polynomials
    Bukhsh, Madiha
    Ali, Muhammad Saqib
    Ashraf, Muhammad Usman
    Alsubhi, Khalid
    Chen, Weiqiu
    IEEE ACCESS, 2022, 10 : 28194 - 28205
  • [36] Long short-term memory recurrent neural network for pharmacokinetic-pharmacodynamic modeling
    Liu, Xiangyu
    Liu, Chao
    Huang, Ruihao
    Zhu, Hao
    Liu, Qi
    Mitra, Sunanda
    Wang, Yaning
    INTERNATIONAL JOURNAL OF CLINICAL PHARMACOLOGY AND THERAPEUTICS, 2021, 59 (02) : 138 - 146
  • [37] Work in Progress Level Prediction with Long Short-Term Memory Recurrent Neural Network
    Gallina, Viola
    Lingitz, Lukas
    Breitschopf, Johannes
    Zudor, Elisabeth
    Sihn, Wilfried
    10TH CIRP SPONSORED CONFERENCE ON DIGITAL ENTERPRISE TECHNOLOGIES (DET 2020) - DIGITAL TECHNOLOGIES AS ENABLERS OF INDUSTRIAL COMPETITIVENESS AND SUSTAINABILITY, 2021, 54 : 136 - 141
  • [39] Monitoring ICU Mortality Risk with A Long Short-Term Memory Recurrent Neural Network
    Yu, Ke
    Zhang, Mingda
    Cui, Tianyi
    Hauskrecht, Milos
    PACIFIC SYMPOSIUM ON BIOCOMPUTING 2020, 2020, : 103 - 114
  • [40] Long short-term memory recurrent neural network architectures for Urdu acoustic modeling
    Zia, Tehseen
    Zahid, Usman
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (01) : 21 - 30