Information geometry of the EM and em algorithms for neural networks

Cited by: 177
Author
Amari, SI
Keywords
EM algorithm; information geometry; stochastic model of neural networks; learning; identification of neural network; e-projection; m-projection; hidden variable
DOI
10.1016/0893-6080(95)00003-8
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
To realize an input-output relation given by noise-contaminated examples, it is effective to use a stochastic model of neural networks. When the model network includes hidden units whose activation values are neither specified nor observed, it is useful to estimate the hidden variables from the observed or specified input-output data based on the stochastic model. Two algorithms, the EM and em algorithms, have so far been proposed for this purpose. The EM algorithm is an iterative statistical technique based on the conditional expectation, and the em algorithm is a geometrical one derived from information geometry. The em algorithm iteratively minimizes the Kullback-Leibler divergence in the manifold of neural networks. These two algorithms are equivalent in most cases. The present paper gives a unified information geometrical framework for studying stochastic models of neural networks, by focusing on the EM and em algorithms, and proves a condition that guarantees their equivalence. Examples include (1) the stochastic multilayer perceptron, (2) mixtures of experts, and (3) the normal mixture model.
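The abstract's example (3), the normal mixture model, is the easiest place to see the two iterations side by side: the E-step computes the conditional expectation of the hidden component labels (the e-projection from the model onto the data manifold), and the M-step maximizes the expected complete-data log-likelihood (the m-projection back onto the model manifold). Below is a minimal sketch for a one-dimensional mixture of Gaussians, assuming the standard EM update equations; the function name em_normal_mixture and all parameter choices are illustrative, not taken from the paper.

```python
import numpy as np

def em_normal_mixture(x, n_components=2, n_iter=100, seed=0):
    """Minimal EM for a 1-D normal mixture (illustrative sketch).

    E-step: conditional expectation of the hidden labels
            (the e-projection onto the data manifold).
    M-step: maximum-likelihood parameter update
            (the m-projection onto the model manifold).
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    w = np.full(n_components, 1.0 / n_components)    # mixing weights
    mu = rng.choice(x, n_components, replace=False)  # component means
    var = np.full(n_components, np.var(x))           # component variances

    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x[i])
        dens = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from the soft assignments
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Usage: recover a two-component mixture from synthetic data.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 700)])
print(em_normal_mixture(x))
```

In this sketch the two steps coincide with one EM/em cycle; for curved models the statistical and geometrical iterations can differ, and the equivalence condition the paper proves delimits when they do not.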
Pages: 1379-1408
Page count: 30