State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions

Cited by: 1
Authors
Wang, Cheng [1 ]
Lawrence, Carolin [2 ]
Niepert, Mathias [2 ,3 ]
Affiliations
[1] Amazon, D-10117 Berlin, Germany
[2] NEC Labs Europe, D-69115 Heidelberg, Germany
[3] Univ Stuttgart, D-70174 Stuttgart, Germany
Keywords
Stochastic processes; Logic gates; Learning automata; Behavioral sciences; Probabilistic logic; Symbols; Memory management; Automata extraction; explainability; interpretability; memorization; recurrent neural networks; state machine
DOI
10.1109/TPAMI.2022.3225334
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, they are often treated as black-box models, and as such it is difficult to understand what exactly they learn as well as how they arrive at a particular prediction. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in principle. We aim to address both shortcomings with a class of recurrent networks that use a stochastic state transition mechanism between cell applications. This mechanism, which we term state-regularization, makes RNNs transition between a finite set of learnable states. We evaluate state-regularized RNNs on (1) regular languages for the purpose of automata extraction; (2) non-regular languages such as balanced parentheses and palindromes, where external memory is required; and (3) real-world sequence learning tasks for sentiment analysis, visual object recognition, and text categorization. We show that state-regularization (a) simplifies the extraction of finite state automata that display an RNN's state transition dynamics; (b) forces RNNs to operate more like automata with external memory and less like finite state machines, which potentially leads to more structured memory; and (c) leads to better interpretability and explainability of RNNs by leveraging the probabilistic finite state transition mechanism over time steps.
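To make the mechanism described in the abstract concrete, the sketch below shows one plausible reading of a state-regularized recurrent cell: after each ordinary cell application, the hidden state is softly snapped to a probability-weighted mixture of k learnable centroid states, so the transition probabilities over that finite state set can later be read off for automata extraction. This is an illustrative reconstruction from the abstract's description, not the authors' reference implementation; the class name, the GRU backbone, and the hyperparameters num_states and temperature are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateRegularizedGRUCell(nn.Module):
    """Sketch of a state-regularized recurrent cell (assumed design,
    reconstructed from the abstract, not the authors' code)."""

    def __init__(self, input_size, hidden_size, num_states=10, temperature=1.0):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # k learnable centroids, one row per discrete state
        self.centroids = nn.Parameter(torch.randn(num_states, hidden_size))
        self.temperature = temperature

    def forward(self, x, h):
        u = self.cell(x, h)  # ordinary cell application
        # transition probabilities over the finite state set
        scores = u @ self.centroids.t() / self.temperature
        alpha = F.softmax(scores, dim=-1)  # (batch, num_states)
        # new hidden state: probability-weighted mixture of centroids
        h_new = alpha @ self.centroids
        return h_new, alpha  # alpha exposes the state transition per step

# Usage: run over a sequence and read off the most probable state per step,
# which is the raw material for extracting a finite state automaton.
cell = StateRegularizedGRUCell(input_size=8, hidden_size=16, num_states=5)
x = torch.randn(4, 12, 8)  # (batch, time, features)
h = torch.zeros(4, 16)
for t in range(x.size(1)):
    h, alpha = cell(x[:, t], h)
    states = alpha.argmax(dim=-1)  # discrete state assignments per example
```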
Pages: 7739-7750
Number of pages: 12
Related Papers
50 records in total (first 10 shown)
  • [1] State-Regularized Recurrent Neural Networks
    Wang, Cheng
    Niepert, Mathias
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [2] Representation and identification of finite state automata by recurrent neural networks
    Kuroe, Y
    NEURAL INFORMATION PROCESSING, 2004, 3316 : 261 - 268
  • [3] Identification of Finite State Automata With a Class of Recurrent Neural Networks
    Won, Sung Hwan
    Song, Iickho
    Lee, Sun Young
    Park, Cheol Hoon
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2010, 21 (09) : 1408 - 1421
  • [4] Constructing deterministic finite-state automata in recurrent neural networks
    Omlin, CW
    Giles, CL
    JOURNAL OF THE ACM, 1996, 43 (06) : 937 - 972
  • [5] Training and extraction of fuzzy finite state automata in recurrent neural networks
    Chandra, Rohitash
    Omlin, Christian W.
    PROCEEDINGS OF THE SECOND IASTED INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, 2006, : 271 - 275
  • [6] Representation of fuzzy finite state automata in continuous recurrent neural networks
    Omlin, CW
    Thornber, KK
    Giles, CL
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1023 - 1027
  • [7] Recurrent neural networks and finite automata
    Siegelmann, HT
    COMPUTATIONAL INTELLIGENCE, 1996, 12 (04) : 567 - 574
  • [8] Adaptive β scheduling learning method of finite state automata by recurrent neural networks
    Arai, K
    Nakano, R
    PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 351 - 354
  • [9] Learning minimal automata with recurrent neural networks
    Aichernig, Bernhard K.
    Koenig, Sandra
    Mateis, Cristinel
    Pferscher, Andrea
    Tappler, Martin
    SOFTWARE AND SYSTEMS MODELING, 2024, 23 (03) : 625 - 655
  • [10] Finite State Automata and Simple Recurrent Networks
    Cleeremans, Axel
    Servan-Schreiber, David
    McClelland, James L.
    NEURAL COMPUTATION, 1989, 1 (03) : 372 - 381