Dynamics of learning near singularities in layered networks.

Abstract:

We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in the parameter space, where the Fisher information matrix degenerates and odd learning behaviors arise, most notably plateaus in gradient descent learning, due to the geometric structure of the singularity. We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities. The singularity induces two types of plateaus, the on-singularity plateau and the near-singularity plateau, depending on the stability of the singularity and the initial parameters of learning. The results presented in this letter apply universally to a wide class of hierarchical models. A detailed stability analysis of the dynamics of learning in radial basis function networks and multilayer perceptrons will be presented in separate work.
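The plateau phenomenon described in the abstract can be reproduced in a few lines. The sketch below is not the authors' code; the one-unit teacher, two-unit student, initial values, and learning rate are all assumptions chosen for illustration. Starting the student near the permutation-symmetry singularity (two nearly identical hidden units) typically produces a long stretch where the loss barely moves before the units differentiate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher: a single tanh unit. The two-unit student is overrealizable,
# so the subset of parameter space with j1 == j2 is singular.
j_star, w_star = 1.5, 1.0
X = rng.normal(size=500)
y = w_star * np.tanh(j_star * X)

# Student parameters, initialized close to the overlap singularity j1 = j2.
j = np.array([0.50, 0.51])   # nearly identical input weights
w = np.array([0.40, 0.40])   # identical output weights

eta, losses = 0.05, []
for t in range(3000):
    h = np.tanh(np.outer(X, j))            # (N, 2) hidden activations
    err = h @ w - y                        # (N,) residuals
    losses.append(0.5 * np.mean(err ** 2))
    # Batch gradients of the mean squared error.
    grad_w = (err[:, None] * h).mean(axis=0)
    grad_j = (err[:, None] * w * (1 - h ** 2) * X[:, None]).mean(axis=0)
    w -= eta * grad_w
    j -= eta * grad_j

# The loss typically falls quickly to the best single-unit fit, then
# stalls on a plateau before the two hidden units separate.
print(losses[0], losses[len(losses) // 2], losses[-1])
```

Plotting `losses` against the iteration index makes the plateau visible as a long flat segment between the initial drop and the eventual descent.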

journal_name

Neural Comput

journal_title

Neural computation

authors

Wei H,Zhang J,Cousseau F,Ozeki T,Amari S

doi

10.1162/neco.2007.12-06-414

subject

Has Abstract

pub_date

2008-03-01 00:00:00

pages

813-843

issue

3

eissn

1530-888X

issn

0899-7667

journal_volume

20

pub_type

Letter
  • A Distributed Framework for the Construction of Transport Maps.

    abstract::The need to reason about uncertainty in large, complex, and multimodal data sets has become increasingly common across modern scientific environments. The ability to transform samples from one distribution P to another distribution

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01172

    authors: Mesa DA,Tantiongloc J,Mendoza M,Kim S,P Coleman T

    update_date: 2019-04-01 00:00:00

  • Connection topology selection in central pattern generators by maximizing the gain of information.

    abstract::A study of a general central pattern generator (CPG) is carried out by means of a measure of the gain of information between the number of available topology configurations and the output rhythmic activity. The neurons of the CPG are chaotic Hindmarsh-Rose models that cooperate dynamically to generate either chaotic o...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.4.974

    authors: Stiesberg GR,Reyes MB,Varona P,Pinto RD,Huerta R

    update_date: 2007-04-01 00:00:00

  • Capturing the Dynamical Repertoire of Single Neurons with Generalized Linear Models.

    abstract::A key problem in computational neuroscience is to find simple, tractable models that are nevertheless flexible enough to capture the response properties of real neurons. Here we examine the capabilities of recurrent point process models known as Poisson generalized linear models (GLMs). These models are defined by a s...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01021

    authors: Weber AI,Pillow JW

    update_date: 2017-12-01 00:00:00

  • Nonmonotonic generalization bias of Gaussian mixture models.

    abstract::Theories of learning and generalization hold that the generalization bias, defined as the difference between the training error and the generalization error, increases on average with the number of adaptive parameters. This article, however, shows that this general tendency is violated for a gaussian mixture model. Fo...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976600300015439

    authors: Akaho S,Kappen HJ

    update_date: 2000-06-01 00:00:00

  • Optimality of Upper-Arm Reaching Trajectories Based on the Expected Value of the Metabolic Energy Cost.

    abstract::When we move our body to perform a movement task, our central nervous system selects a movement trajectory from an infinite number of possible trajectories under constraints that have been acquired through evolution and learning. Minimization of the energy cost has been suggested as a potential candidate for a constra...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00757

    authors: Taniai Y,Nishii J

    update_date: 2015-08-01 00:00:00

  • Propagating distributions up directed acyclic graphs.

    abstract::In a previous article, we considered game trees as graphical models. Adopting an evaluation function that returned a probability distribution over values likely to be taken at a given position, we described how to build a model of uncertainty and use it for utility-directed growth of the search tree and for deciding o...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016881

    authors: Baum EB,Smith WD

    update_date: 1999-01-01 00:00:00

  • Dependence of neuronal correlations on filter characteristics and marginal spike train statistics.

    abstract::Correlated neural activity has been observed at various signal levels (e.g., spike count, membrane potential, local field potential, EEG, fMRI BOLD). Most of these signals can be considered as superpositions of spike trains filtered by components of the neural system (synapses, membranes) and the measurement process. ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2008.05-07-525

    authors: Tetzlaff T,Rotter S,Stark E,Abeles M,Aertsen A,Diesmann M

    update_date: 2008-09-01 00:00:00

  • Parsing Complex Sentences with Structured Connectionist Networks.

    abstract::A modular, recurrent connectionist network is taught to incrementally parse complex sentences. From input presented one word at a time, the network learns to do semantic role assignment, noun phrase attachment, and clause structure recognition, for sentences with both active and passive constructions and center-embedd...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.1991.3.1.110

    authors: Jain AN

    update_date: 1991-04-01 00:00:00

  • Discriminant component pruning. Regularization and interpretation of multi-layered back-propagation networks.

    abstract::Neural networks are often employed as tools in classification tasks. The use of large networks increases the likelihood of the task's being learned, although it may also lead to increased complexity. Pruning is an effective way of reducing the complexity of large networks. We present discriminant components pruning (D...

    journal_title:Neural computation

    pub_type: Journal Article, Review

    doi:10.1162/089976699300016665

    authors: Koene RA,Takane Y

    update_date: 1999-04-01 00:00:00

  • Whence the Expected Free Energy?

    abstract::The expected free energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decomposition into extrinsic and intrinsic value terms is key to the balance of exploration and exploitation that active inference...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01354

    authors: Millidge B,Tschantz A,Buckley CL

    update_date: 2021-01-05 00:00:00

  • Estimating spiking irregularities under changing environments.

    abstract::We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. A gamma distribution is a natural extension of the Poisson process taking the effect of a refractory period into account. The model is specified by two parameters: a time-dependent firing rate and a shape p...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2006.18.10.2359

    authors: Miura K,Okada M,Amari S

    update_date: 2006-10-01 00:00:00

  • Parameter Identifiability in Statistical Machine Learning: A Review.

    abstract::This review examines the relevance of parameter identifiability for statistical models used in machine learning. In addition to defining main concepts, we address several issues of identifiability closely related to machine learning, showing the advantages and disadvantages of state-of-the-art research and demonstrati...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00947

    authors: Ran ZY,Hu BG

    update_date: 2017-05-01 00:00:00

  • A first-order nonhomogeneous Markov model for the response of spiking neurons stimulated by small phase-continuous signals.

    abstract::We present a first-order nonhomogeneous Markov model for the interspike-interval density of a continuously stimulated spiking neuron. The model allows the conditional interspike-interval density and the stationary interspike-interval density to be expressed as products of two separate functions, one of which describes...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2009.06-07-548

    authors: Tapson J,Jin C,van Schaik A,Etienne-Cummings R

    update_date: 2009-06-01 00:00:00

  • Topographic mapping of large dissimilarity data sets.

    abstract::Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectori...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00012

    authors: Hammer B,Hasenfuss A

    update_date: 2010-09-01 00:00:00

  • A Reservoir Computing Model of Reward-Modulated Motor Learning and Automaticity.

    abstract::Reservoir computing is a biologically inspired class of learning algorithms in which the intrinsic dynamics of a recurrent neural network are mined to produce target time series. Most existing reservoir computing algorithms rely on fully supervised learning rules, which require access to an exact copy of the target re...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01198

    authors: Pyle R,Rosenbaum R

    update_date: 2019-07-01 00:00:00

  • Abstract stimulus-specific adaptation models.

    abstract::Many neurons that initially respond to a stimulus stop responding if the stimulus is presented repeatedly but recover their response if a different stimulus is presented. This phenomenon is referred to as stimulus-specific adaptation (SSA). SSA has been investigated extensively using oddball experiments, which measure...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00077

    authors: Mill R,Coath M,Wennekers T,Denham SL

    update_date: 2011-02-01 00:00:00

  • Sequential triangle strip generator based on Hopfield networks.

    abstract::The important task of generating the minimum number of sequential triangle strips (tristrips) for a given triangulated surface model is motivated by applications in computer graphics. This hard combinatorial optimization problem is reduced to the minimum energy problem in Hopfield nets by a linear-size construction. I...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2008.10-07-623

    authors: Síma J,Lnĕnicka R

    update_date: 2009-02-01 00:00:00

  • On the problem in model selection of neural network regression in overrealizable scenario.

    abstract::In considering a statistical model selection of neural networks and radial basis functions under an overrealizable case, the problem of unidentifiability emerges. Because the model selection criterion is an unbiased estimator of the generalization error based on the training error, this article analyzes the expected t...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976602760128090

    authors: Hagiwara K

    update_date: 2002-08-01 00:00:00

  • Positive Neural Networks in Discrete Time Implement Monotone-Regular Behaviors.

    abstract::We study the expressive power of positive neural networks. The model uses positive connection weights and multiple input neurons. Different behaviors can be expressed by varying the connection weights. We show that in discrete time and in the absence of noise, the class of positive neural networks captures the so-call...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00789

    authors: Ameloot TJ,Van den Bussche J

    update_date: 2015-12-01 00:00:00

  • Deficient GABAergic gliotransmission may cause broader sensory tuning in schizophrenia.

    abstract::We examined how the depression of intracortical inhibition due to a reduction in ambient GABA concentration impairs perceptual information processing in schizophrenia. A neural network model with a gliotransmission-mediated ambient GABA regulatory mechanism was simulated. In the network, interneuron-to-glial-cell and ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00519

    authors: Hoshino O

    update_date: 2013-12-01 00:00:00

  • Investigating the fault tolerance of neural networks.

    abstract::Particular levels of partial fault tolerance (PFT) in feedforward artificial neural networks of a given size can be obtained by redundancy (replicating a smaller normally trained network), by design (training specifically to increase PFT), and by a combination of the two (replicating a smaller PFT-trained network). Th...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766053723096

    authors: Tchernev EB,Mulvaney RG,Phatak DS

    update_date: 2005-07-01 00:00:00

  • Nonlinear Time-Series Prediction with Missing and Noisy Data.

    abstract::We derive solutions for the problem of missing and noisy data in nonlinear time-series prediction from a probabilistic point of view. We discuss different approximations to the solutions; in particular, approximations that require either stochastic simulation or the substitution of a single estimate for t...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976698300017728

    authors: Tresp V,Hofmann R

    update_date: 1998-03-23 00:00:00

  • On the slow convergence of EM and VBEM in low-noise linear models.

    abstract::We analyze convergence of the expectation maximization (EM) and variational Bayes EM (VBEM) schemes for parameter estimation in noisy linear models. The analysis shows that both schemes are inefficient in the low-noise limit. The linear model with additive noise includes as special cases independent component analysis...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766054322991

    authors: Petersen KB,Winther O,Hansen LK

    update_date: 2005-09-01 00:00:00

  • Analyzing and Accelerating the Bottlenecks of Training Deep SNNs With Backpropagation.

    abstract::Spiking neural networks (SNNs) with the event-driven manner of transmitting spikes consume ultra-low power on neuromorphic chips. However, training deep SNNs is still challenging compared to convolutional neural networks (CNNs). The SNN training algorithms have not achieved the same performance as CNNs. In this letter...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01319

    authors: Chen R,Li L

    update_date: 2020-12-01 00:00:00

  • The Deterministic Information Bottleneck.

    abstract::Lossy compression and clustering fundamentally involve a decision about which features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek ( 1999 ) formalized this notion as an information-theoretic optimization problem and proposed an optimal trade-off between throwin...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00961

    authors: Strouse DJ,Schwab DJ

    update_date: 2017-06-01 00:00:00

  • Higher-order statistics of input ensembles and the response of simple model neurons.

    abstract::Pairwise correlations among spike trains recorded in vivo have been frequently reported. It has been argued that correlated activity could play an important role in the brain, because it efficiently modulates the response of a postsynaptic neuron. We show here that a neuron's output firing rate critically depends on t...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976603321043702

    authors: Kuhn A,Aertsen A,Rotter S

    update_date: 2003-01-01 00:00:00

  • Insect-inspired estimation of egomotion.

    abstract::Tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during egomotion. In this study, we examine whether a simplified linear model based on the organization principles in tangential neurons can be used to estimate egomotion from the optic flow. We present a theory for the cons...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766041941899

    authors: Franz MO,Chahl JS,Krapp HG

    update_date: 2004-11-01 00:00:00

  • Capturing the Forest but Missing the Trees: Microstates Inadequate for Characterizing Shorter-Scale EEG Dynamics.

    abstract::The brain is known to be active even when not performing any overt cognitive tasks, and often it engages in involuntary mind wandering. This resting state has been extensively characterized in terms of fMRI-derived brain networks. However, an alternate method has recently gained popularity: EEG microstate analysis. Pr...

    journal_title:Neural computation

    pub_type: Letter

    doi:10.1162/neco_a_01229

    authors: Shaw SB,Dhindsa K,Reilly JP,Becker S

    update_date: 2019-11-01 00:00:00

  • Rapid processing and unsupervised learning in a model of the cortical macrocolumn.

    abstract::We study a model of the cortical macrocolumn consisting of a collection of inhibitorily coupled minicolumns. The proposed system overcomes several severe deficits of systems based on single neurons as cerebral functional units, notably limited robustness to damage and unrealistically large computation time. Motivated ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976604772744893

    authors: Lücke J,von der Malsburg C

    update_date: 2004-03-01 00:00:00

  • The relationship between synchronization among neuronal populations and their mean activity levels.

    abstract::In the past decade the importance of synchronized dynamics in the brain has emerged from both empirical and theoretical perspectives. Fast dynamic synchronous interactions of an oscillatory or nonoscillatory nature may constitute a form of temporal coding that underlies feature binding and perceptual synthesis. The re...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016287

    authors: Chawla D,Lumer ED,Friston KJ

    update_date: 1999-08-15 00:00:00