Dynamic Neural Turing Machine with Continuous and Discrete Addressing Schemes.

Abstract:

We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read and write into a memory through experiments on Facebook bAbI tasks using both feedforward and GRU controllers. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.

journal_name

Neural Comput

journal_title

Neural computation

authors

Gulcehre C,Chandar S,Cho K,Bengio Y

doi

10.1162/neco_a_01060

pub_date

2018-04-01 00:00:00

pages

857-884

issue

4

eissn

1530-888X

issn

0899-7667

journal_volume

30

pub_type

Journal Article

  • State-Space Representations of Deep Neural Networks.

    abstract::This letter deals with neural networks as dynamical systems governed by finite difference equations. It shows that the introduction of k-many skip connections into network architectures, such as residual networks and additive dense n...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01165

    authors: Hauser M,Gunn S,Saab S Jr,Ray A

    update_date:2019-03-01 00:00:00

  • Active Learning for Enumerating Local Minima Based on Gaussian Process Derivatives.

    abstract::We study active learning (AL) based on gaussian processes (GPs) for efficiently enumerating all of the local minimum solutions of a black-box function. This problem is challenging because local solutions are characterized by their zero gradient and positive-definite Hessian properties, but those derivatives cannot be ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco_a_01307

    authors: Inatsu Y,Sugita D,Toyoura K,Takeuchi I

    update_date:2020-10-01 00:00:00

  • Improving generalization performance of natural gradient learning using optimized regularization by NIC.

    abstract::Natural gradient learning is known to be efficient in escaping plateau, which is a main cause of the slow learning speed of neural networks. The adaptive natural gradient learning method for practical implementation also has been developed, and its advantage in real-world problems has been confirmed. In this letter, w...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976604322742065

    authors: Park H,Murata N,Amari S

    update_date:2004-02-01 00:00:00

  • Sufficient dimension reduction via squared-loss mutual information estimation.

    abstract::The goal of sufficient dimension reduction in supervised learning is to find the low-dimensional subspace of input features that contains all of the information about the output values that the input features possess. In this letter, we propose a novel sufficient dimension-reduction method using a squared-loss variant...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00407

    authors: Suzuki T,Sugiyama M

    update_date:2013-03-01 00:00:00

  • An internal model for acquisition and retention of motor learning during arm reaching.

    abstract::Humans have the ability to learn novel motor tasks while manipulating the environment. Several models of motor learning have been proposed in the literature, but few of them address the problem of retention and interference of motor memory. The modular selection and identification for control (MOSAIC) model, originall...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2009.03-08-721

    authors: Lonini L,Dipietro L,Zollo L,Guglielmelli E,Krebs HI

    update_date:2009-07-01 00:00:00

  • The Ornstein-Uhlenbeck process does not reproduce spiking statistics of neurons in prefrontal cortex.

    abstract::Cortical neurons of behaving animals generate irregular spike sequences. Recently, there has been a heated discussion about the origin of this irregularity. Softky and Koch (1993) pointed out the inability of standard single-neuron models to reproduce the irregularity of the observed spike sequences when the model par...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016511

    authors: Shinomoto S,Sakai Y,Funahashi S

    update_date:1999-05-15 00:00:00

  • Competition between synaptic depression and facilitation in attractor neural networks.

    abstract::We study the effect of competition between short-term synaptic depression and facilitation on the dynamic properties of attractor neural networks, using Monte Carlo simulation and a mean-field analysis. Depending on the balance of depression, facilitation, and the underlying noise, the network displays different behav...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.10.2739

    authors: Torres JJ,Cortes JM,Marro J,Kappen HJ

    update_date:2007-10-01 00:00:00

  • A neurocomputational approach to prepositional phrase attachment ambiguity resolution.

    abstract::A neurocomputational model based on emergent massively overlapping neural cell assemblies (CAs) for resolving prepositional phrase (PP) attachment ambiguity is described. PP attachment ambiguity is a well-studied task in natural language processing and is a case where semantics is used to determine the syntactic struc...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00290

    authors: Nadh K,Huyck C

    update_date:2012-07-01 00:00:00

  • Minimizing binding errors using learned conjunctive features.

    abstract::We have studied some of the design trade-offs governing visual representations based on spatially invariant conjunctive feature detectors, with an emphasis on the susceptibility of such systems to false-positive recognition errors-Malsburg's classical binding problem. We begin by deriving an analytical model that make...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976600300015574

    authors: Mel BW,Fiser J

    update_date:2000-04-01 00:00:00

  • Binocular receptive field models, disparity tuning, and characteristic disparity.

    abstract::Disparity tuning of visual cells in the brain depends on the structure of their binocular receptive fields (RFs). Freeman and coworkers have found that binocular RFs of a typical simple cell can be quantitatively described by two Gabor functions with the same gaussian envelope but different phase parameters in the sin...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.1996.8.8.1611

    authors: Zhu YD,Qian N

    update_date:1996-11-15 00:00:00

  • Robustness of connectionist swimming controllers against random variation in neural connections.

    abstract::The ability to achieve high swimming speed and efficiency is very important to both the real lamprey and its robotic implementation. In previous studies, we used evolutionary algorithms to evolve biologically plausible connectionist swimming controllers for a simulated lamprey. This letter investigates the robustness ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.6.1568

    authors: Or J

    update_date:2007-06-01 00:00:00

  • Extraction of Synaptic Input Properties in Vivo.

    abstract::Knowledge of synaptic input is crucial for understanding synaptic integration and ultimately neural function. However, in vivo, the rates at which synaptic inputs arrive are high, so that it is typically impossible to detect single events. We show here that it is nevertheless possible to extract the properties of the ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00975

    authors: Puggioni P,Jelitai M,Duguid I,van Rossum MCW

    update_date:2017-07-01 00:00:00

  • Nonlinear Time-Series Prediction with Missing and Noisy Data.

    abstract::We derive solutions for the problem of missing and noisy data in nonlinear time-series prediction from a probabilistic point of view. We discuss different approximations to the solutions, in particular, approximations that require either stochastic simulation or the substitution of a single estimate for t...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976698300017728

    authors: Tresp V,Hofmann R

    update_date:1998-03-23 00:00:00

  • Sequential Tests for Large-Scale Learning.

    abstract::We argue that when faced with big data sets, learning and inference algorithms should compute updates using only subsets of data items. We introduce algorithms that use sequential hypothesis tests to adaptively select such a subset of data points. The statistical properties of this subsampling process can be used to c...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00796

    authors: Korattikara A,Chen Y,Welling M

    update_date:2016-01-01 00:00:00

  • Mirror symmetric topographic maps can arise from activity-dependent synaptic changes.

    abstract::Multiple adjacent, roughly mirror-image topographic maps are commonly observed in the sensory neocortex of many species. The cortical regions occupied by these maps are generally believed to be determined initially by genetically controlled chemical markers during development, with thalamocortical afferent activity su...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/0899766053491904

    authors: Schulz R,Reggia JA

    update_date:2005-05-01 00:00:00

  • Populations of tightly coupled neurons: the RGC/LGN system.

    abstract::A mathematical model of general character for the dynamic description of coupled neural oscillators is presented. The population approach that is employed applies equally to coupled cells as to populations of such coupled cells. The formulation includes stochasticity and preserves details of precisely firing neurons....

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.03-07-482

    authors: Sirovich L

    update_date:2008-05-01 00:00:00

  • A graphical model framework for decoding in the visual ERP-based BCI speller.

    abstract::We present a graphical model framework for decoding in the visual ERP-based speller system. The proposed framework allows researchers to build generative models from which the decoding rules are obtained in a straightforward manner. We suggest two models for generating brain signals conditioned on the stimulus events....

    journal_title:Neural computation

    pub_type: Letter

    doi:10.1162/NECO_a_00066

    authors: Martens SM,Mooij JM,Hill NJ,Farquhar J,Schölkopf B

    update_date:2011-01-01 00:00:00

  • Capturing the Forest but Missing the Trees: Microstates Inadequate for Characterizing Shorter-Scale EEG Dynamics.

    abstract::The brain is known to be active even when not performing any overt cognitive tasks, and often it engages in involuntary mind wandering. This resting state has been extensively characterized in terms of fMRI-derived brain networks. However, an alternate method has recently gained popularity: EEG microstate analysis. Pr...

    journal_title:Neural computation

    pub_type: Letter

    doi:10.1162/neco_a_01229

    authors: Shaw SB,Dhindsa K,Reilly JP,Becker S

    update_date:2019-11-01 00:00:00

  • MISEP method for postnonlinear blind source separation.

    abstract::In this letter, a standard postnonlinear blind source separation algorithm is proposed, based on the MISEP method, which is widely used in linear and nonlinear independent component analysis. To best suit a wide class of postnonlinear mixtures, we adapt the MISEP method to incorporate a priori information of the mixtu...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.9.2557

    authors: Zheng CH,Huang DS,Li K,Irwin G,Sun ZL

    update_date:2007-09-01 00:00:00

  • Estimating a state-space model from point process observations: a note on convergence.

    abstract::Physiological signals such as neural spikes and heartbeats are discrete events in time, driven by continuous underlying systems. A recently introduced data-driven model to analyze such a system is a state-space model with point process observations, parameters of which and the underlying state sequence are simultaneou...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2010.07-09-1047

    authors: Yuan K,Niranjan M

    update_date:2010-08-01 00:00:00

  • Piecewise-linear neural networks and their relationship to rule extraction from data.

    abstract::This article addresses the topic of extracting logical rules from data by means of artificial neural networks. The approach based on piecewise linear neural networks is revisited, which has already been used for the extraction of Boolean rules in the past, and it is shown that this approach can be important also for t...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2006.18.11.2813

    authors: Holena M

    update_date:2006-11-01 00:00:00

  • ParceLiNGAM: a causal ordering method robust against latent confounders.

    abstract::We consider learning a causal ordering of variables in a linear nongaussian acyclic model called LiNGAM. Several methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct. But the estimation results could be distorted if some assumptions are violated. In thi...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00533

    authors: Tashiro T,Shimizu S,Hyvärinen A,Washio T

    update_date:2014-01-01 00:00:00

  • Visual Categorization with Random Projection.

    abstract::Humans learn categories of complex objects quickly and from a few examples. Random projection has been suggested as a means to learn and categorize efficiently. We investigate how random projection affects categorization by humans and by very simple neural networks on the same stimuli and categorization tasks, and how...

    journal_title:Neural computation

    pub_type: Letter

    doi:10.1162/NECO_a_00769

    authors: Arriaga RI,Rutter D,Cakmak M,Vempala SS

    update_date:2015-10-01 00:00:00

  • Boosted mixture of experts: an ensemble learning scheme.

    abstract::We present a new supervised learning procedure for ensemble machines, in which outputs of predictors, trained on different distributions, are combined by a dynamic classifier combination model. This procedure may be viewed as either a version of mixture of experts (Jacobs, Jordan, Nowlan, & Hinton, 1991), applied to ...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976699300016737

    authors: Avnimelech R,Intrator N

    update_date:1999-02-15 00:00:00

  • Multistability in spiking neuron models of delayed recurrent inhibitory loops.

    abstract::We consider the effect of the effective timing of a delayed feedback on the excitatory neuron in a recurrent inhibitory loop, when biological realities of firing and absolute refractory period are incorporated into a phenomenological spiking linear or quadratic integrate-and-fire neuron model. We show that such models...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2007.19.8.2124

    authors: Ma J,Wu J

    update_date:2007-08-01 00:00:00

  • McCulloch-Pitts Brains and Pseudorandom Functions.

    abstract::In a pioneering classic, Warren McCulloch and Walter Pitts proposed a model of the central nervous system. Motivated by EEG recordings of normal brain activity, Chvátal and Goldsmith asked whether these dynamical systems can be engineered to produce trajectories that are irregular, disorderly, and apparently unpredict...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/NECO_a_00841

    authors: Chvátal V,Goldsmith M,Yang N

    update_date:2016-06-01 00:00:00

  • Determining Burst Firing Time Distributions from Multiple Spike Trains.

    abstract::Recent experimental findings have shown the presence of robust and cell-type-specific intraburst firing patterns in bursting neurons. We address the problem of characterizing these patterns under the assumption that the bursts exhibit well-defined firing time distributions. We propose a method for estimating these dis...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/neco.2008.07-07-571

    authors: Lago-Fernández LF,Szücs A,Varona P

    update_date:2009-04-01 00:00:00

  • Temporal sequence learning, prediction, and control: a review of different models and their relation to biological mechanisms.

    abstract::In this review, we compare methods for temporal sequence learning (TSL) across the disciplines machine-control, classical conditioning, neuronal models for TSL as well as spike-timing-dependent plasticity (STDP). This review introduces the most influential models and focuses on two questions: To what degree are reward...

    journal_title:Neural computation

    pub_type: Journal Article, Review

    doi:10.1162/0899766053011555

    authors: Wörgötter F,Porr B

    update_date:2005-02-01 00:00:00

  • Patterns of synchrony in neural networks with spike adaptation.

    abstract::We study the emergence of synchronized burst activity in networks of neurons with spike adaptation. We show that networks of tonically firing adapting excitatory neurons can evolve to a state where the neurons burst in a synchronized manner. The mechanism leading to this burst activity is analyzed in a network of inte...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/08997660151134280

    authors: van Vreeswijk C,Hansel D

    update_date:2001-05-01 00:00:00

  • The time-organized map algorithm: extending the self-organizing map to spatiotemporal signals.

    abstract::The new time-organized map (TOM) is presented for a better understanding of the self-organization and geometric structure of cortical signal representations. The algorithm extends the common self-organizing map (SOM) from the processing of purely spatial signals to the processing of spatiotemporal signals. The main ad...

    journal_title:Neural computation

    pub_type: Journal Article

    doi:10.1162/089976603765202695

    authors: Wiemer JC

    update_date:2003-05-01 00:00:00