On the problem in model selection of neural network regression in overrealizable scenario.

Abstract:

In considering a statistical model selection of neural networks and radial basis functions under an overrealizable case, the problem of unidentifiability emerges. Because the model selection criterion is an unbiased estimator of the generalization error based on the training error, this article analyzes the expected training error and the expected generalization error of neural networks and radial basis functions in overrealizable cases and clarifies the difference from regular models, for which identifiability holds. As a special case of an overrealizable scenario, we assumed a gaussian noise sequence as training data. In the least-squares estimation under this assumption, we first formulated the problem, in which the calculation of the expected errors of unidentifiable networks is reduced to the calculation of the expectation of the supremum of the chi-square process. Under this formulation, we gave an upper bound of the expected training error and a lower bound of the expected generalization error, where the generalization is measured at a set of training inputs. Furthermore, we gave stochastic bounds on the training error and the generalization error. The obtained upper bound of the expected training error is smaller than in regular models, and the lower bound of the expected generalization error is larger than in regular models. The result tells us that the degree of overfitting in neural networks and radial basis functions is higher than in regular models. Correspondingly, it also tells us that the generalization capability is worse than in the case of regular models. The article may be enough to show a difference between neural networks and regular models in the context of the least-squares estimation in a simple situation. This is a first step in constructing a model selection criterion in an overrealizable case. Further important problems in this direction are also included in this article.
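The qualitative claim in the abstract can be checked numerically. The following is a minimal Monte Carlo sketch, not the article's derivation: it fits pure gaussian noise by least squares with (a) a single radial basis function whose center is fixed in advance, standing in for a regular model, and (b) a single radial basis function whose center is also chosen by least squares. The function name `simulate`, the grid search over candidate centers, the fixed width, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def simulate(n=50, n_centers=200, width=0.05, sigma=1.0, trials=2000, seed=0):
    """Compare a fixed-center and a free-center RBF unit fitted to pure noise.

    Assumptions (not from the article): one hidden unit, fixed width,
    and a grid search over centers in place of exact least squares.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)                 # training inputs
    centers = np.linspace(0.0, 1.0, n_centers)   # candidate RBF centers

    # One unit-norm basis vector per candidate center, shape (n_centers, n).
    phi = np.exp(-((x[None, :] - centers[:, None]) ** 2) / (2.0 * width ** 2))
    phi /= np.linalg.norm(phi, axis=1, keepdims=True)

    # Training targets are gaussian noise only: the true function is zero.
    y = rng.normal(0.0, sigma, size=(trials, n))
    total = (y ** 2).sum(axis=1)

    # Squared projection of y onto each basis vector. For a given center this
    # is the drop in the residual sum of squares after fitting the output
    # weight, and also the squared norm of the fitted function at the inputs.
    gain = (y @ phi.T) ** 2                      # shape (trials, n_centers)

    fixed_gain = gain[:, 0]        # center fixed in advance (regular model)
    free_gain = gain.max(axis=1)   # center chosen by least squares (supremum)

    return {
        "train_fixed": float(np.mean(total - fixed_gain) / n),
        "train_free": float(np.mean(total - free_gain) / n),
        "gen_fixed": float(np.mean(fixed_gain) / n),
        "gen_free": float(np.mean(free_gain) / n),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Because the true function is zero, the generalization error measured at the training inputs is simply the mean squared value of the fitted function there. In this sketch the free-center unit should show a smaller average training error and a larger average generalization error than the fixed-center unit, which is the direction of the bounds described in the abstract; the sizes of the gaps depend on the assumed width and center grid.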

journal_name

Neural Comput

journal_title

Neural computation

authors

Hagiwara K

doi

10.1162/089976602760128090

subject

Has Abstract

pub_date

2002-08-01 00:00:00

pages

1979-2002

issue

8

eissn

1530-888X

issn

0899-7667

journal_volume

14

pub_type

Journal Article
  • Computing with self-excitatory cliques: A model and an application to hyperacuity-scale computation in visual cortex.

    abstract: We present a model of visual computation based on tightly inter-connected cliques of pyramidal cells. It leads to a formal theory of cell assemblies, a specific relationship between correlated firing patterns and abstract functionality, and a direct calculation relating estimates of cortical cell counts to orientation...

    journal_title:Neural computation

    pub_type: Journal Article, Review

    doi:10.1162/089976699300016782

    authors: Miller DA,Zucker SW

    update_date: 1999-01-01 00:00:00

  • Multiple model-based reinforcement learning.

    abstract: We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The sys...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976602753712972

    authors: Doya K,Samejima K,Katagiri K,Kawato M

    update_date: 2002-06-01 00:00:00

  • Correlational Neural Networks.

    abstract: Common representation learning (CRL), wherein different descriptions (or views) of the data are embedded in a common subspace, has been receiving a lot of attention recently. Two popular paradigms here are canonical correlation analysis (CCA)-based approaches and autoencoder (AE)-based approaches. CCA-based approaches...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00801

    authors: Chandar S,Khapra MM,Larochelle H,Ravindran B

    update_date: 2016-02-01 00:00:00

  • Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning.

    abstract: Most conventional policy gradient reinforcement learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the policy parameter. That term involves the derivative of the stationary state distribution that corresponds to the sensitivity of its distributio...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2009.12-08-922

    authors: Morimura T,Uchibe E,Yoshimoto J,Peters J,Doya K

    update_date: 2010-02-01 00:00:00

  • Active Learning for Enumerating Local Minima Based on Gaussian Process Derivatives.

    abstract: We study active learning (AL) based on gaussian processes (GPs) for efficiently enumerating all of the local minimum solutions of a black-box function. This problem is challenging because local solutions are characterized by their zero gradient and positive-definite Hessian properties, but those derivatives cannot be ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01307

    authors: Inatsu Y,Sugita D,Toyoura K,Takeuchi I

    update_date: 2020-10-01 00:00:00

  • A Resource-Allocating Network for Function Interpolation.

    abstract: We have created a network that allocates a new computational unit whenever an unusual pattern is presented to the network. This network forms compact representations, yet learns easily and rapidly. The network can be used at any time in the learning process and the learning patterns do not have to be repeated. The uni...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.1991.3.2.213

    authors: Platt J

    update_date: 1991-07-01 00:00:00

  • Accelerated spike resampling for accurate multiple testing controls.

    abstract: Controlling for multiple hypothesis tests using standard spike resampling techniques often requires prohibitive amounts of computation. Importance sampling techniques can be used to accelerate the computation. The general theory is presented, along with specific examples for testing differences across conditions using...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00399

    authors: Harrison MT

    update_date: 2013-02-01 00:00:00

  • The effects of input rate and synchrony on a coincidence detector: analytical solution.

    abstract: We derive analytically the solution for the output rate of the ideal coincidence detector. The solution is for an arbitrary number of input spike trains with identical binomial count distributions (which includes Poisson statistics as a special case) and identical arbitrary pairwise cross-correlations, from zero corre...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976603321192068

    authors: Mikula S,Niebur E

    update_date: 2003-03-01 00:00:00

  • Solution methods for a new class of simple model neurons.

    abstract: Izhikevich (2003) proposed a new canonical neuron model of spike generation. The model was surprisingly simple yet able to accurately replicate the firing patterns of different types of cortical cell. Here, we derive a solution method that allows efficient simulation of the model. ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.12.3216

    authors: Humphries MD,Gurney K

    update_date: 2007-12-01 00:00:00

  • Learning only when necessary: better memories of correlated patterns in networks with bounded synapses.

    abstract: Learning in a neuronal network is often thought of as a linear superposition of synaptic modifications induced by individual stimuli. However, since biological synapses are naturally bounded, a linear superposition would cause fast forgetting of previously acquired memories. Here we show that this forgetting can be av...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766054615644

    authors: Senn W,Fusi S

    update_date: 2005-10-01 00:00:00

  • Estimating a state-space model from point process observations: a note on convergence.

    abstract: Physiological signals such as neural spikes and heartbeats are discrete events in time, driven by continuous underlying systems. A recently introduced data-driven model to analyze such a system is a state-space model with point process observations, parameters of which and the underlying state sequence are simultaneou...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2010.07-09-1047

    authors: Yuan K,Niranjan M

    update_date: 2010-08-01 00:00:00

  • A Distributed Framework for the Construction of Transport Maps.

    abstract: The need to reason about uncertainty in large, complex, and multimodal data sets has become increasingly common across modern scientific environments. The ability to transform samples from one distribution P to another distribution

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01172

    authors: Mesa DA,Tantiongloc J,Mendoza M,Kim S,P Coleman T

    update_date: 2019-04-01 00:00:00

  • Synchrony and desynchrony in integrate-and-fire oscillators.

    abstract: Due to many experimental reports of synchronous neural activity in the brain, there is much interest in understanding synchronization in networks of neural oscillators and its potential for computing perceptual organization. Contrary to Hopfield and Herz (1995), we find that networks of locally coupled integrate-and-f...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016160

    authors: Campbell SR,Wang DL,Jayaprakash C

    update_date: 1999-10-01 00:00:00

  • Mismatched training and test distributions can outperform matched ones.

    abstract: In learning theory, the training and test sets are assumed to be drawn from the same probability distribution. This assumption is also followed in practical situations, where matching the training and test distributions is considered desirable. Contrary to conventional wisdom, we show that mismatched training and test...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00697

    authors: González CR,Abu-Mostafa YS

    update_date: 2015-02-01 00:00:00

  • Bayesian active learning of neural firing rate maps with transformed gaussian process priors.

    abstract: A firing rate map, also known as a tuning curve, describes the nonlinear relationship between a neuron's spike rate and a low-dimensional stimulus (e.g., orientation, head direction, contrast, color). Here we investigate Bayesian active learning methods for estimating firing rate maps in closed-loop neurophysiology ex...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00615

    authors: Park M,Weller JP,Horwitz GD,Pillow JW

    update_date: 2014-08-01 00:00:00

  • Boosted mixture of experts: an ensemble learning scheme.

    abstract: We present a new supervised learning procedure for ensemble machines, in which outputs of predictors, trained on different distributions, are combined by a dynamic classifier combination model. This procedure may be viewed as either a version of mixture of experts (Jacobs, Jordan, Nowlan, & Hinton, 1991), applied to ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016737

    authors: Avnimelech R,Intrator N

    update_date: 1999-02-15 00:00:00

  • A Gaussian attractor network for memory and recognition with experience-dependent learning.

    abstract: Attractor networks are widely believed to underlie the memory systems of animals across different species. Existing models have succeeded in qualitatively modeling properties of attractor dynamics, but their computational abilities often suffer from poor representations for realistic complex patterns, spurious attract...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2010.02-09-957

    authors: Hu X,Zhang B

    update_date: 2010-05-01 00:00:00

  • Normalization enables robust validation of disparity estimates from neural populations.

    abstract: Binocular fusion takes place over a limited region smaller than one degree of visual angle (Panum's fusional area), which is on the order of the range of preferred disparities measured in populations of disparity-tuned neurons in the visual cortex. However, the actual range of binocular disparities encountered in natu...

    journal_title:Neural computation

    pub_type: Letter

    doi:10.1162/neco.2008.05-07-532

    authors: Tsang EK,Shi BE

    update_date: 2008-10-01 00:00:00

  • Feature selection in simple neurons: how coding depends on spiking dynamics.

    abstract: The relationship between a neuron's complex inputs and its spiking output defines the neuron's coding strategy. This is frequently and effectively modeled phenomenologically by one or more linear filters that extract the components of the stimulus that are relevant for triggering spikes and a nonlinear function that r...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2009.02-09-956

    authors: Famulare M,Fairhall A

    update_date: 2010-03-01 00:00:00

  • Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization.

    abstract: We present a comprehensive framework of search methods, such as simulated annealing and batch training, for solving nonconvex optimization problems. These methods search a wider range by gradually decreasing the randomness added to the standard gradient descent method. The formulation that we define on the basis of th...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01089

    authors: Takase T,Oyama S,Kurihara M

    update_date: 2018-07-01 00:00:00

  • Delay Differential Analysis of Seizures in Multichannel Electrocorticography Data.

    abstract: High-density electrocorticogram (ECoG) electrodes are capable of recording neurophysiological data with high temporal resolution with wide spatial coverage. These recordings are a window to understanding how the human brain processes information and subsequently behaves in healthy and pathologic states. Here, we descr...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01009

    authors: Lainscsek C,Weyhenmeyer J,Cash SS,Sejnowski TJ

    update_date: 2017-12-01 00:00:00

  • Change-based inference in attractor nets: linear analysis.

    abstract: One standard interpretation of networks of cortical neurons is that they form dynamical attractors. Computations such as stimulus estimation are performed by mapping inputs to points on the networks' attractive manifolds. These points represent population codes for the stimulus values. However, this standard interpret...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00051

    authors: Moazzezi R,Dayan P

    update_date: 2010-12-01 00:00:00

  • Conditional density estimation with dimensionality reduction via squared-loss conditional entropy minimization.

    abstract: Regression aims at estimating the conditional mean of output given input. However, regression is not informative enough if the conditional density is multimodal, heteroskedastic, and asymmetric. In such a case, estimating the conditional density itself is preferable, but conditional density estimation (CDE) is challen...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00683

    authors: Tangkaratt V,Xie N,Sugiyama M

    update_date: 2015-01-01 00:00:00

  • Statistical computer model analysis of the reciprocal and recurrent inhibitions of the Ia-EPSP in α-motoneurons.

    abstract: We simulate the inhibition of Ia-glutamatergic excitatory postsynaptic potential (EPSP) by preceding it with glycinergic recurrent (REN) and reciprocal (REC) inhibitory postsynaptic potentials (IPSPs). The inhibition is evaluated in the presence of voltage-dependent conductances of sodium, delayed rectifier potassium,...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00375

    authors: Gradwohl G,Grossman Y

    update_date: 2013-01-01 00:00:00

  • Neural Quadratic Discriminant Analysis: Nonlinear Decoding with V1-Like Computation.

    abstract: Linear-nonlinear (LN) models and their extensions have proven successful in describing transformations from stimuli to spiking responses of neurons in early stages of sensory hierarchies. Neural responses at later stages are highly nonlinear and have generally been better characterized in terms of their decoding perfo...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00890

    authors: Pagan M,Simoncelli EP,Rust NC

    update_date: 2016-11-01 00:00:00

  • Binocular receptive field models, disparity tuning, and characteristic disparity.

    abstract: Disparity tuning of visual cells in the brain depends on the structure of their binocular receptive fields (RFs). Freeman and coworkers have found that binocular RFs of a typical simple cell can be quantitatively described by two Gabor functions with the same gaussian envelope but different phase parameters in the sin...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.1996.8.8.1611

    authors: Zhu YD,Qian N

    update_date: 1996-11-15 00:00:00

  • A neural-network-based approach to the double traveling salesman problem.

    abstract: The double traveling salesman problem is a variation of the basic traveling salesman problem where targets can be reached by two salespersons operating in parallel. The real problem addressed by this work concerns the optimization of the harvest sequence for the two independent arms of a fruit-harvesting robot. This a...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/08997660252741194

    authors: Plebe A,Anile AM

    update_date: 2002-02-01 00:00:00

  • Convergence of the IRWLS Procedure to the Support Vector Machine Solution.

    abstract: An iterative reweighted least squares (IRWLS) procedure recently proposed is shown to converge to the support vector machine solution. The convergence to a stationary point is ensured by modifying the original IRWLS procedure. ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766052530875

    authors: Pérez-Cruz F,Bousoño-Calzón C,Artés-Rodríguez A

    update_date: 2005-01-01 00:00:00

  • Discriminant component pruning. Regularization and interpretation of multi-layered back-propagation networks.

    abstract: Neural networks are often employed as tools in classification tasks. The use of large networks increases the likelihood of the task's being learned, although it may also lead to increased complexity. Pruning is an effective way of reducing the complexity of large networks. We present discriminant components pruning (D...

    journal_title:Neural computation

    pub_type: Journal Article, Review

    doi:10.1162/089976699300016665

    authors: Koene RA,Takane Y

    update_date: 1999-04-01 00:00:00

  • An extended analytic expression for the membrane potential distribution of conductance-based synaptic noise.

    abstract: Synaptically generated subthreshold membrane potential (Vm) fluctuations can be characterized within the framework of stochastic calculus. It is possible to obtain analytic expressions for the steady-state Vm distribution, even in the case of conductance-based synaptic currents. However, as we show here, the analytic ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766054796932

    authors: Rudolph M,Destexhe A

    update_date: 2005-11-01 00:00:00