LSTM recurrent networks learn simple context-free and context-sensitive languages

Cited by: 467
Authors
Gers, FA [1 ]
Schmidhuber, J [1 ]
Affiliations
[1] IDSIA, CH-6928 Manno, Switzerland
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2001, Vol. 12, No. 6
Keywords
context-free languages (CFLs); context-sensitive languages (CSLs); long short-term memory (LSTM); recurrent neural networks (RNNs);
DOI
10.1109/72.963769
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Previous work on learning regular languages from exemplary training sequences showed that long short-term memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context-free language (CFL) benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a simple context-sensitive language (CSL), namely a^n b^n c^n.
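As an illustration of the CSL task mentioned in the abstract, the sketch below generates training strings of the form a^n b^n c^n and the set of symbols that may legally follow each prefix, as one would use in a next-symbol-prediction setup. The start/end markers 'S' and 'T', the prediction framing, and the function names are assumptions made for this sketch; the record above does not specify the authors' exact input/output encoding.

```python
# Minimal sketch (illustrative, not the authors' exact setup): training strings
# for the context-sensitive language a^n b^n c^n and the legal next symbols
# after each prefix. The start/end markers 'S'/'T' are assumptions.

def anbncn(n):
    """One training string for a^n b^n c^n, wrapped in start/end markers."""
    return "S" + "a" * n + "b" * n + "c" * n + "T"

def prediction_targets(n):
    """Sets of legal next symbols after each prefix of S a^n b^n c^n T.

    After 'S' only 'a' is legal; while reading a's the continuation is
    ambiguous ('a' or 'b'); once the first 'b' appears, n is determined
    and the rest of the string is fully predictable.
    """
    targets = [{"a"}]                        # after 'S'
    targets += [{"a", "b"}] * n              # after each 'a'
    targets += [{"b"}] * (n - 1) + [{"c"}]   # after each 'b'
    targets += [{"c"}] * (n - 1) + [{"T"}]   # after each 'c'
    return targets

if __name__ == "__main__":
    for n in (1, 2, 3):
        s = anbncn(n)
        print(s)
        for symbol, legal in zip(s, prediction_targets(n)):
            print(f"  after {symbol!r}: {sorted(legal)}")
```

Predicting these targets correctly requires counting the a's and carrying that count across the b- and c-blocks, which is what places a^n b^n c^n beyond the context-free languages and makes it a hard benchmark for conventional RNNs.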
Pages: 1333-1340
Number of pages: 8