Abstract: We formulate an equivalence between machine learning and the formulation of statistical data assimilation as used widely in physical and biological sciences. The correspondence is that layer number in a feedforward artificial network setting is the analog of time in the data assimilation setting. This connection has been noted in the machine learning literature. We add a perspective that expands on how methods from statistical physics and aspects of Lagrangian and Hamiltonian dynamics play a role in how networks can be trained and designed. Within the discussion of this equivalence, we show that adding more layers (making the network deeper) is analogous to adding temporal resolution in a data assimilation framework. Extending this equivalence to recurrent networks is also discussed. We explore how one can find a candidate for the global minimum of the cost functions in the machine learning context using a method from data assimilation. Calculations on simple models from both sides of the equivalence are reported. Also discussed is a framework in which the time or layer label is taken to be continuous, providing a differential equation, the Euler-Lagrange equation and its boundary conditions, as a necessary condition for a minimum of the cost function. This shows that the problem being solved is a two-point boundary value problem familiar in the discussion of variational methods. The use of continuous layers is denoted "deepest learning." These problems respect a symplectic symmetry in continuous layer phase space. Both Lagrangian versions and Hamiltonian versions of these problems are presented. Their well-studied implementation in a discrete time/layer, while respecting the symplectic structure, is addressed. The Hamiltonian version provides a direct rationale for backpropagation as a solution method for a certain two-point boundary value problem.
journal_name:Neural Comput
journal_title:Neural computation
authors:Abarbanel HDI, Rozdeba PJ, Shirman S
doi:10.1162/neco_a_01094
subject:Has Abstract
pub_date:2018-08-01 00:00:00
pages:2025-2055
issue:8
issn:0899-7667
eissn:1530-888X
journal_volume:30
pub_type: journal article

abstract::We propose a new principle for replicating receptive field properties of neurons in the primary visual cortex. We derive a learning rule for a feedforward network, which maintains a low firing rate for the output neurons (resulting in temporal sparseness) and allows only a small subset of the neurons in the network to...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00341
update_date:2012-10-01 00:00:00
abstract::Complexity of one-hidden-layer networks is studied using tools from nonlinear approximation and integration theory. For functions with suitable integral representations in the form of networks with infinitely many hidden units, upper bounds are derived on the speed of decrease of approximation error as the number of n...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2009.04-08-745
update_date:2009-10-01 00:00:00
abstract::The hypothesis of invariant maximization of interaction (IMI) is formulated within the setting of random fields. According to this hypothesis, learning processes maximize the stochastic interaction of the neurons subject to constraints. We consider the extrinsic constraint in terms of a fixed input distribution on the...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976602760805368
update_date:2002-12-01 00:00:00
abstract::Independent component analysis (ICA) aims at separating a multivariate signal into independent nongaussian signals by optimizing a contrast function with no knowledge on the mixing mechanism. Despite the availability of a constellation of contrast functions, a Hartley-entropy-based ICA contrast endowed with the discri...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00700
update_date:2015-03-01 00:00:00
abstract::Cortical neurons of behaving animals generate irregular spike sequences. Recently, there has been a heated discussion about the origin of this irregularity. Softky and Koch (1993) pointed out the inability of standard single-neuron models to reproduce the irregularity of the observed spike sequences when the model par...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976699300016511
update_date:1999-05-15 00:00:00
abstract::Natural gradient learning is known to be efficient in escaping plateaus, which are a main cause of the slow learning speed of neural networks. The adaptive natural gradient learning method for practical implementation has also been developed, and its advantage in real-world problems has been confirmed. In this letter, w...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976604322742065
update_date:2004-02-01 00:00:00
abstract::Knowledge of synaptic input is crucial for understanding synaptic integration and ultimately neural function. However, in vivo, the rates at which synaptic inputs arrive are high, so that it is typically impossible to detect single events. We show here that it is nevertheless possible to extract the properties of the ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00975
update_date:2017-07-01 00:00:00
abstract::We considered a gamma distribution of interspike intervals as a statistical model for neuronal spike generation. A gamma distribution is a natural extension of the Poisson process taking the effect of a refractory period into account. The model is specified by two parameters: a time-dependent firing rate and a shape p...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2006.18.10.2359
update_date:2006-10-01 00:00:00
abstract::We analyze convergence of the expectation maximization (EM) and variational Bayes EM (VBEM) schemes for parameter estimation in noisy linear models. The analysis shows that both schemes are inefficient in the low-noise limit. The linear model with additive noise includes as special cases independent component analysis...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/0899766054322991
update_date:2005-09-01 00:00:00
abstract::Mild traumatic brain injury (mTBI) presents a significant health concern with potential persisting deficits that can last decades. Although a growing body of literature improves our understanding of the brain network response and corresponding underlying cellular alterations after injury, the effects of cellular disru...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01343
update_date:2021-01-01 00:00:00
abstract::Particular levels of partial fault tolerance (PFT) in feedforward artificial neural networks of a given size can be obtained by redundancy (replicating a smaller normally trained network), by design (training specifically to increase PFT), and by a combination of the two (replicating a smaller PFT-trained network). Th...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/0899766053723096
update_date:2005-07-01 00:00:00
abstract::In a recent paper, Poggio and Girosi (1990) proposed a class of neural networks obtained from the theory of regularization. Regularized networks are capable of approximating arbitrarily well any continuous function on a compactum. In this paper we consider in detail the learning problem for the one-dimensional case. W...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.1995.7.6.1225
update_date:1995-11-01 00:00:00
abstract::The ability to encode and transmit a signal is an essential property that many neuronal circuits in sensory areas must demonstrate, in addition to any processing they may provide. It is known that an appropriate level of lateral inhibition, as observed in these areas, can significantly improve the encoding ability of a...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00100
update_date:2011-04-01 00:00:00
abstract::The visual systems of many mammals, including humans, are able to integrate the geometric information of visual stimuli and perform cognitive tasks at the first stages of the cortical processing. This is thought to be the result of a combination of mechanisms, which include feature extraction at the single cell level ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00738
update_date:2015-06-01 00:00:00
abstract::Synaptic runaway denotes the formation of erroneous synapses and premature functional decline accompanying activity-dependent learning in neural networks. This work studies synaptic runaway both analytically and numerically in binary-firing associative memory networks. It turns out that synaptic runaway is of fairly m...
journal_title:Neural computation
pub_type: journal article, review
doi:10.1162/089976698300017836
update_date:1998-02-15 00:00:00
abstract::Physiological signals such as neural spikes and heartbeats are discrete events in time, driven by continuous underlying systems. A recently introduced data-driven model to analyze such a system is a state-space model with point process observations, parameters of which and the underlying state sequence are simultaneou...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2010.07-09-1047
update_date:2010-08-01 00:00:00
abstract::In this letter, we propose a noisy nonlinear version of independent component analysis (ICA). Assuming that the probability density function (p.d.f.) of sources is known, a learning rule is derived based on maximum likelihood estimation (MLE). Our model involves some algorithms of noisy linear ICA (e.g., Bermond & ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/0899766052530866
update_date:2005-01-01 00:00:00
abstract::We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains for each memory cell two separate vectors, content and address vectors. This allows the D-NTM to learn a wide variety of location-based addressing stra...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01060
update_date:2018-04-01 00:00:00
abstract::The dynamic formation of groups of neurons (neuronal assemblies) is believed to mediate cognitive phenomena at many levels, but their detailed operation and mechanisms of interaction are still to be uncovered. One hypothesis suggests that synchronized oscillations underpin their formation and functioning, with a focus...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00502
update_date:2013-11-01 00:00:00
abstract::We propose a scalable semiparametric Bayesian model to capture dependencies among multiple neurons by detecting their cofiring (possibly with some lag time) patterns over time. After discretizing time so there is at most one spike at each interval, the resulting sequence of 1s (spike) and 0s (silence) for each neuron ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00631
update_date:2014-09-01 00:00:00
abstract::Calculation of the total conductance change induced by multiple synapses at a given membrane compartment remains one of the most time-consuming processes in biophysically realistic neural network simulations. Here we show that this calculation can be achieved in a highly efficient way even for multiply converging syna...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976698300017061
update_date:1998-10-01 00:00:00
abstract::A hippocampal prosthesis is a very large scale integration (VLSI) biochip that needs to be implanted in the biological brain to solve a cognitive dysfunction. In this letter, we propose a novel low-complexity, small-area, and low-power programmable hippocampal neural network application-specific integrated circuit (AS...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01107
update_date:2018-09-01 00:00:00
abstract::Although the number of artificial neural network and machine learning architectures is growing at an exponential pace, more attention needs to be paid to theoretical guarantees of asymptotic convergence for novel, nonlinear, high-dimensional adaptive learning algorithms. When properly understood, such guarantees can g...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01117
update_date:2018-10-01 00:00:00
abstract::Based on the dopamine hypotheses of cocaine addiction and the assumption of decrement of brain reward system sensitivity after long-term drug exposure, we propose a computational model for cocaine addiction. Utilizing average reward temporal difference reinforcement learning, we incorporate the elevation of basal rewa...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2009.10-08-882
update_date:2009-10-01 00:00:00
abstract::We present a reduction of a Hodgkin-Huxley (HH)-style bursting model to a hybridized integrate-and-fire (IF) formalism based on a thorough bifurcation analysis of the neuron's dynamics. The model incorporates HH-style equations to evolve the subthreshold currents and includes IF mechanisms to characterize spike even...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976603322518768
update_date:2003-12-01 00:00:00
abstract::Linear-nonlinear (LN) models and their extensions have proven successful in describing transformations from stimuli to spiking responses of neurons in early stages of sensory hierarchies. Neural responses at later stages are highly nonlinear and have generally been better characterized in terms of their decoding perfo...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00890
update_date:2016-11-01 00:00:00
abstract::Current research on discrete and rhythmic movements differs in both experimental procedures and theory, despite the ubiquitous overlap between discrete and rhythmic components in everyday behaviors. Models of rhythmic movements usually use oscillatory systems mimicking central pattern generators (CPGs). In contrast, m...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2008.03-08-720
update_date:2009-05-01 00:00:00
abstract::We present a comprehensive framework of search methods, such as simulated annealing and batch training, for solving nonconvex optimization problems. These methods search a wider range by gradually decreasing the randomness added to the standard gradient descent method. The formulation that we define on the basis of th...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01089
update_date:2018-07-01 00:00:00
abstract::A conductance-based model of Na+ and K+ currents underlying action potential generation is introduced by simplifying the quantitative model of Hodgkin and Huxley (HH). If the time course of rate constants can be approximated by a pulse, HH equations can be solved analytically. Pulse-based (PB) models generate action p...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.1997.9.3.503
update_date:1997-04-01 00:00:00
abstract::One standard interpretation of networks of cortical neurons is that they form dynamical attractors. Computations such as stimulus estimation are performed by mapping inputs to points on the networks' attractive manifolds. These points represent population codes for the stimulus values. However, this standard interpret...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00051
update_date:2010-12-01 00:00:00