Abstract:
:We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in their parameter space, where the Fisher information matrix degenerates and odd learning behaviors, especially the existence of plateaus in gradient descent learning, arise due to the geometric structure of singularity. We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities. The singularity induces two types of plateaus, the on-singularity plateau and the near-singularity plateau, depending on the stability of the singularity and the initial parameters of learning. The results presented in this letter are universally applicable to a wide class of hierarchical models. Detailed stability analysis of the dynamics of learning in radial basis function networks and multilayer perceptrons will be presented in separate work.
journal_name
Neural Computjournal_title
Neural computationauthors
Wei H,Zhang J,Cousseau F,Ozeki T,Amari Sdoi
10.1162/neco.2007.12-06-414subject
Has Abstractpub_date
2008-03-01 00:00:00pages
813-43issue
3eissn
0899-7667issn
1530-888Xjournal_volume
20pub_type
信件abstract::The need to reason about uncertainty in large, complex, and multimodal data sets has become increasingly common across modern scientific environments. The ability to transform samples from one distribution journal_title:Neural computation pub_type: 杂志文章 doi:10.1162/neco_a_01172 更新日期:2019-04-01 00:00:00
abstract::A study of a general central pattern generator (CPG) is carried out by means of a measure of the gain of information between the number of available topology configurations and the output rhythmic activity. The neurons of the CPG are chaotic Hindmarsh-Rose models that cooperate dynamically to generate either chaotic o...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.2007.19.4.974
更新日期:2007-04-01 00:00:00
abstract::A key problem in computational neuroscience is to find simple, tractable models that are nevertheless flexible enough to capture the response properties of real neurons. Here we examine the capabilities of recurrent point process models known as Poisson generalized linear models (GLMs). These models are defined by a s...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco_a_01021
更新日期:2017-12-01 00:00:00
abstract::Theories of learning and generalization hold that the generalization bias, defined as the difference between the training error and the generalization error, increases on average with the number of adaptive parameters. This article, however, shows that this general tendency is violated for a gaussian mixture model. Fo...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976600300015439
更新日期:2000-06-01 00:00:00
abstract::When we move our body to perform a movement task, our central nervous system selects a movement trajectory from an infinite number of possible trajectories under constraints that have been acquired through evolution and learning. Minimization of the energy cost has been suggested as a potential candidate for a constra...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00757
更新日期:2015-08-01 00:00:00
abstract::In a previous article, we considered game trees as graphical models. Adopting an evaluation function that returned a probability distribution over values likely to be taken at a given position, we described how to build a model of uncertainty and use it for utility-directed growth of the search tree and for deciding o...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976699300016881
更新日期:1999-01-01 00:00:00
abstract::Correlated neural activity has been observed at various signal levels (e.g., spike count, membrane potential, local field potential, EEG, fMRI BOLD). Most of these signals can be considered as superpositions of spike trains filtered by components of the neural system (synapses, membranes) and the measurement process. ...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.2008.05-07-525
更新日期:2008-09-01 00:00:00
abstract::A modular, recurrent connectionist network is taught to incrementally parse complex sentences. From input presented one word at a time, the network learns to do semantic role assignment, noun phrase attachment, and clause structure recognition, for sentences with both active and passive constructions and center-embedd...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.1991.3.1.110
更新日期:1991-04-01 00:00:00
abstract::Neural networks are often employed as tools in classification tasks. The use of large networks increases the likelihood of the task's being learned, although it may also lead to increased complexity. Pruning is an effective way of reducing the complexity of large networks. We present discriminant components pruning (D...
journal_title:Neural computation
pub_type: 杂志文章,评审
doi:10.1162/089976699300016665
更新日期:1999-04-01 00:00:00
abstract::The expected free energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decomposition into extrinsic and intrinsic value terms is key to the balance of exploration and exploitation that active inference...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco_a_01354
更新日期:2021-01-05 00:00:00
abstract::We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. A gamma distribution is a natural extension of the Poisson process taking the effect of a refractory period into account. The model is specified by two parameters: a time-dependent firing rate and a shape p...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.2006.18.10.2359
更新日期:2006-10-01 00:00:00
abstract::This review examines the relevance of parameter identifiability for statistical models used in machine learning. In addition to defining main concepts, we address several issues of identifiability closely related to machine learning, showing the advantages and disadvantages of state-of-the-art research and demonstrati...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00947
更新日期:2017-05-01 00:00:00
abstract::We present a first-order nonhomogeneous Markov model for the interspike-interval density of a continuously stimulated spiking neuron. The model allows the conditional interspike-interval density and the stationary interspike-interval density to be expressed as products of two separate functions, one of which describes...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.2009.06-07-548
更新日期:2009-06-01 00:00:00
abstract::Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectori...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00012
更新日期:2010-09-01 00:00:00
abstract::Reservoir computing is a biologically inspired class of learning algorithms in which the intrinsic dynamics of a recurrent neural network are mined to produce target time series. Most existing reservoir computing algorithms rely on fully supervised learning rules, which require access to an exact copy of the target re...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco_a_01198
更新日期:2019-07-01 00:00:00
abstract::Many neurons that initially respond to a stimulus stop responding if the stimulus is presented repeatedly but recover their response if a different stimulus is presented. This phenomenon is referred to as stimulus-specific adaptation (SSA). SSA has been investigated extensively using oddball experiments, which measure...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00077
更新日期:2011-02-01 00:00:00
abstract::The important task of generating the minimum number of sequential triangle strips (tristrips) for a given triangulated surface model is motivated by applications in computer graphics. This hard combinatorial optimization problem is reduced to the minimum energy problem in Hopfield nets by a linear-size construction. I...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco.2008.10-07-623
更新日期:2009-02-01 00:00:00
abstract::In considering a statistical model selection of neural networks and radial basis functions under an overrealizable case, the problem of unidentifiability emerges. Because the model selection criterion is an unbiased estimator of the generalization error based on the training error, this article analyzes the expected t...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976602760128090
更新日期:2002-08-01 00:00:00
abstract::We study the expressive power of positive neural networks. The model uses positive connection weights and multiple input neurons. Different behaviors can be expressed by varying the connection weights. We show that in discrete time and in the absence of noise, the class of positive neural networks captures the so-call...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00789
更新日期:2015-12-01 00:00:00
abstract::We examined how the depression of intracortical inhibition due to a reduction in ambient GABA concentration impairs perceptual information processing in schizophrenia. A neural network model with a gliotransmission-mediated ambient GABA regulatory mechanism was simulated. In the network, interneuron-to-glial-cell and ...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00519
更新日期:2013-12-01 00:00:00
abstract::Particular levels of partial fault tolerance (PFT) in feedforward artificial neural networks of a given size can be obtained by redundancy (replicating a smaller normally trained network), by design (training specifically to increase PFT), and by a combination of the two (replicating a smaller PFT-trained network). Th...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/0899766053723096
更新日期:2005-07-01 00:00:00
abstract::We derive solutions for the problem of missing and noisy data in nonlinear time&hyphenseries prediction from a probabilistic point of view. We discuss different approximations to the solutions &hyphen in particular, approximations that require either stochastic simulation or the substitution of a single estimate for t...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976698300017728
更新日期:1998-03-23 00:00:00
abstract::We analyze convergence of the expectation maximization (EM) and variational Bayes EM (VBEM) schemes for parameter estimation in noisy linear models. The analysis shows that both schemes are inefficient in the low-noise limit. The linear model with additive noise includes as special cases independent component analysis...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/0899766054322991
更新日期:2005-09-01 00:00:00
abstract::Spiking neural networks (SNNs) with the event-driven manner of transmitting spikes consume ultra-low power on neuromorphic chips. However, training deep SNNs is still challenging compared to convolutional neural networks (CNNs). The SNN training algorithms have not achieved the same performance as CNNs. In this letter...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/neco_a_01319
更新日期:2020-12-01 00:00:00
abstract::Lossy compression and clustering fundamentally involve a decision about which features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek ( 1999 ) formalized this notion as an information-theoretic optimization problem and proposed an optimal trade-off between throwin...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/NECO_a_00961
更新日期:2017-06-01 00:00:00
abstract::Pairwise correlations among spike trains recorded in vivo have been frequently reported. It has been argued that correlated activity could play an important role in the brain, because it efficiently modulates the response of a postsynaptic neuron. We show here that a neuron's output firing rate critically depends on t...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976603321043702
更新日期:2003-01-01 00:00:00
abstract::Tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during egomotion. In this study, we examine whether a simplified linear model based on the organization principles in tangential neurons can be used to estimate egomotion from the optic flow. We present a theory for the cons...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/0899766041941899
更新日期:2004-11-01 00:00:00
abstract::The brain is known to be active even when not performing any overt cognitive tasks, and often it engages in involuntary mind wandering. This resting state has been extensively characterized in terms of fMRI-derived brain networks. However, an alternate method has recently gained popularity: EEG microstate analysis. Pr...
journal_title:Neural computation
pub_type: 信件
doi:10.1162/neco_a_01229
更新日期:2019-11-01 00:00:00
abstract::We study a model of the cortical macrocolumn consisting of a collection of inhibitorily coupled minicolumns. The proposed system overcomes several severe deficits of systems based on single neurons as cerebral functional units, notably limited robustness to damage and unrealistically large computation time. Motivated ...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976604772744893
更新日期:2004-03-01 00:00:00
abstract::In the past decade the importance of synchronized dynamics in the brain has emerged from both empirical and theoretical perspectives. Fast dynamic synchronous interactions of an oscillatory or nonoscillatory nature may constitute a form of temporal coding that underlies feature binding and perceptual synthesis. The re...
journal_title:Neural computation
pub_type: 杂志文章
doi:10.1162/089976699300016287
更新日期:1999-08-15 00:00:00