Abstract:
We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains for each memory cell two separate vectors, content and address vectors. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read and write into a memory through experiments on Facebook bAbI tasks using both a feedforward and GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
journal_name: Neural Comput
journal_title: Neural computation
authors: Gulcehre C, Chandar S, Cho K, Bengio Y
doi: 10.1162/neco_a_01060
subject: Has Abstract
pub_date: 2018-04-01 00:00:00
pages: 857-884
issue: 4
eissn: 1530-888X
issn: 0899-7667
journal_volume: 30
pub_type: journal article
abstract::This letter deals with neural networks as dynamical systems governed by finite difference equations. It shows that the introduction of
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01165
update_date: 2019-03-01 00:00:00
abstract::We study active learning (AL) based on gaussian processes (GPs) for efficiently enumerating all of the local minimum solutions of a black-box function. This problem is challenging because local solutions are characterized by their zero gradient and positive-definite Hessian properties, but those derivatives cannot be ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco_a_01307
update_date: 2020-10-01 00:00:00
abstract::Natural gradient learning is known to be efficient in escaping plateau, which is a main cause of the slow learning speed of neural networks. The adaptive natural gradient learning method for practical implementation also has been developed, and its advantage in real-world problems has been confirmed. In this letter, w...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976604322742065
update_date: 2004-02-01 00:00:00
abstract::The goal of sufficient dimension reduction in supervised learning is to find the low-dimensional subspace of input features that contains all of the information about the output values that the input features possess. In this letter, we propose a novel sufficient dimension-reduction method using a squared-loss variant...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00407
update_date: 2013-03-01 00:00:00
abstract::Humans have the ability to learn novel motor tasks while manipulating the environment. Several models of motor learning have been proposed in the literature, but few of them address the problem of retention and interference of motor memory. The modular selection and identification for control (MOSAIC) model, originall...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2009.03-08-721
update_date: 2009-07-01 00:00:00
abstract::Cortical neurons of behaving animals generate irregular spike sequences. Recently, there has been a heated discussion about the origin of this irregularity. Softky and Koch (1993) pointed out the inability of standard single-neuron models to reproduce the irregularity of the observed spike sequences when the model par...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976699300016511
update_date: 1999-05-15 00:00:00
abstract::We study the effect of competition between short-term synaptic depression and facilitation on the dynamic properties of attractor neural networks, using Monte Carlo simulation and a mean-field analysis. Depending on the balance of depression, facilitation, and the underlying noise, the network displays different behav...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2007.19.10.2739
update_date: 2007-10-01 00:00:00
abstract::A neurocomputational model based on emergent massively overlapping neural cell assemblies (CAs) for resolving prepositional phrase (PP) attachment ambiguity is described. PP attachment ambiguity is a well-studied task in natural language processing and is a case where semantics is used to determine the syntactic struc...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00290
update_date: 2012-07-01 00:00:00
abstract::We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors: Malsburg's classical binding problem. We begin by deriving an analytical model that make...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976600300015574
update_date: 2000-04-01 00:00:00
abstract::Disparity tuning of visual cells in the brain depends on the structure of their binocular receptive fields (RFs). Freeman and coworkers have found that binocular RFs of a typical simple cell can be quantitatively described by two Gabor functions with the same gaussian envelope but different phase parameters in the sin...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.1996.8.8.1611
update_date: 1996-11-15 00:00:00
abstract::The ability to achieve high swimming speed and efficiency is very important to both the real lamprey and its robotic implementation. In previous studies, we used evolutionary algorithms to evolve biologically plausible connectionist swimming controllers for a simulated lamprey. This letter investigates the robustness ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2007.19.6.1568
update_date: 2007-06-01 00:00:00
abstract::Knowledge of synaptic input is crucial for understanding synaptic integration and ultimately neural function. However, in vivo, the rates at which synaptic inputs arrive are high, so that it is typically impossible to detect single events. We show here that it is nevertheless possible to extract the properties of the ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00975
update_date: 2017-07-01 00:00:00
abstract::We derive solutions for the problem of missing and noisy data in nonlinear time-series prediction from a probabilistic point of view. We discuss different approximations to the solutions, in particular approximations that require either stochastic simulation or the substitution of a single estimate for t...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976698300017728
update_date: 1998-03-23 00:00:00
abstract::We argue that when faced with big data sets, learning and inference algorithms should compute updates using only subsets of data items. We introduce algorithms that use sequential hypothesis tests to adaptively select such a subset of data points. The statistical properties of this subsampling process can be used to c...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00796
update_date: 2016-01-01 00:00:00
abstract::Multiple adjacent, roughly mirror-image topographic maps are commonly observed in the sensory neocortex of many species. The cortical regions occupied by these maps are generally believed to be determined initially by genetically controlled chemical markers during development, with thalamocortical afferent activity su...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/0899766053491904
update_date: 2005-05-01 00:00:00
abstract::A mathematical model of general character for the dynamic description of coupled neural oscillators is presented. The population approach that is employed applies equally to coupled cells as to populations of such coupled cells. The formulation includes stochasticity and preserves details of precisely firing neurons....
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2007.03-07-482
update_date: 2008-05-01 00:00:00
abstract::We present a graphical model framework for decoding in the visual ERP-based speller system. The proposed framework allows researchers to build generative models from which the decoding rules are obtained in a straightforward manner. We suggest two models for generating brain signals conditioned on the stimulus events....
journal_title:Neural computation
pub_type: letter
doi:10.1162/NECO_a_00066
update_date: 2011-01-01 00:00:00
abstract::The brain is known to be active even when not performing any overt cognitive tasks, and often it engages in involuntary mind wandering. This resting state has been extensively characterized in terms of fMRI-derived brain networks. However, an alternate method has recently gained popularity: EEG microstate analysis. Pr...
journal_title:Neural computation
pub_type: letter
doi:10.1162/neco_a_01229
update_date: 2019-11-01 00:00:00
abstract::In this letter, a standard postnonlinear blind source separation algorithm is proposed, based on the MISEP method, which is widely used in linear and nonlinear independent component analysis. To best suit a wide class of postnonlinear mixtures, we adapt the MISEP method to incorporate a priori information of the mixtu...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2007.19.9.2557
update_date: 2007-09-01 00:00:00
abstract::Physiological signals such as neural spikes and heartbeats are discrete events in time, driven by continuous underlying systems. A recently introduced data-driven model to analyze such a system is a state-space model with point process observations, parameters of which and the underlying state sequence are simultaneou...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2010.07-09-1047
update_date: 2010-08-01 00:00:00
abstract::This article addresses the topic of extracting logical rules from data by means of artificial neural networks. The approach based on piecewise linear neural networks is revisited, which has already been used for the extraction of Boolean rules in the past, and it is shown that this approach can be important also for t...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2006.18.11.2813
update_date: 2006-11-01 00:00:00
abstract::We consider learning a causal ordering of variables in a linear nongaussian acyclic model called LiNGAM. Several methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct. But the estimation results could be distorted if some assumptions are violated. In thi...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00533
update_date: 2014-01-01 00:00:00
abstract::Humans learn categories of complex objects quickly and from a few examples. Random projection has been suggested as a means to learn and categorize efficiently. We investigate how random projection affects categorization by humans and by very simple neural networks on the same stimuli and categorization tasks, and how...
journal_title:Neural computation
pub_type: letter
doi:10.1162/NECO_a_00769
update_date: 2015-10-01 00:00:00
abstract::We present a new supervised learning procedure for ensemble machines, in which outputs of predictors, trained on different distributions, are combined by a dynamic classifier combination model. This procedure may be viewed as either a version of mixture of experts (Jacobs, Jordan, Nowlan, & Hinton, 1991), applied to ...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976699300016737
update_date: 1999-02-15 00:00:00
abstract::We consider the effect of the effective timing of a delayed feedback on the excitatory neuron in a recurrent inhibitory loop, when biological realities of firing and absolute refractory period are incorporated into a phenomenological spiking linear or quadratic integrate-and-fire neuron model. We show that such models...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2007.19.8.2124
update_date: 2007-08-01 00:00:00
abstract::In a pioneering classic, Warren McCulloch and Walter Pitts proposed a model of the central nervous system. Motivated by EEG recordings of normal brain activity, Chvátal and Goldsmith asked whether these dynamical systems can be engineered to produce trajectories that are irregular, disorderly, and apparently unpredict...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/NECO_a_00841
update_date: 2016-06-01 00:00:00
abstract::Recent experimental findings have shown the presence of robust and cell-type-specific intraburst firing patterns in bursting neurons. We address the problem of characterizing these patterns under the assumption that the bursts exhibit well-defined firing time distributions. We propose a method for estimating these dis...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/neco.2008.07-07-571
update_date: 2009-04-01 00:00:00
abstract::In this review, we compare methods for temporal sequence learning (TSL) across the disciplines machine-control, classical conditioning, neuronal models for TSL as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward...
journal_title:Neural computation
pub_type: journal article, review
doi:10.1162/0899766053011555
update_date: 2005-02-01 00:00:00
abstract::We study the emergence of synchronized burst activity in networks of neurons with spike adaptation. We show that networks of tonically firing adapting excitatory neurons can evolve to a state where the neurons burst in a synchronized manner. The mechanism leading to this burst activity is analyzed in a network of inte...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/08997660151134280
update_date: 2001-05-01 00:00:00
abstract::The new time-organized map (TOM) is presented for a better understanding of the self-organization and geometric structure of cortical signal representations. The algorithm extends the common self-organizing map (SOM) from the processing of purely spatial signals to the processing of spatiotemporal signals. The main ad...
journal_title:Neural computation
pub_type: journal article
doi:10.1162/089976603765202695
update_date: 2003-05-01 00:00:00