
An implementation of reinforcement learning based on spike timing dependent plasticity



Biological Cybernetics 99(6): 517-523



An explanatory model is developed to show how synaptic learning mechanisms modeled through spike-timing-dependent plasticity (STDP) can result in long-term adaptations consistent with reinforcement learning models. In particular, the reinforcement learning model known as temporal difference (TD) learning has been used to model neuronal behavior in the orbitofrontal cortex (OFC) and ventral tegmental area (VTA) of the macaque monkey during reinforcement learning. While some research has observed an empirical connection between STDP and TD, there has not been an explanatory model directly connecting TD to STDP. Through analysis of the learning dynamics that result from a general form of an STDP learning rule, the connection between STDP and TD is explained. We further demonstrate that an STDP learning rule drives the spike probability of a reward-predicting neuronal population to a stable equilibrium. The equilibrium solution has an increasing slope whose steepness predicts the probability of the reward, similar to results from electrophysiological recordings that suggest a different slope predicts the value of the anticipated reward (Montague and Berns [Neuron 36(2):265-284, 2002]). This connection begins to shed light on more recent data gathered from VTA and OFC that are not well modeled by TD. We suggest that STDP provides the underlying mechanism for explaining reinforcement learning and other higher-level perceptual and cognitive functions.
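The two ingredients named in the abstract can be illustrated with a minimal sketch. The Python below shows a generic pair-based exponential STDP window and a tabular TD(0) update. Neither is taken from the paper: the function names, the parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`, `alpha`, `gamma`), and the three-step reward sequence are all assumptions chosen for illustration, and the sketch does not reproduce the paper's derivation linking the two rules.

```python
import math


def stdp_weight_change(dt, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    """Generic pair-based exponential STDP window (illustrative, not the paper's rule).

    dt = t_post - t_pre in milliseconds. Pre-before-post (dt > 0) potentiates
    the synapse; post-before-pre (dt < 0) depresses it.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


def td0_sweep(values, rewards, alpha=0.1, gamma=1.0):
    """One sweep of tabular TD(0) over a fixed sequence of time steps.

    values[t] is the predicted future reward at step t, and rewards[t] is the
    reward received when leaving step t. The value after the last step is 0.
    """
    new_values = list(values)
    for t in range(len(new_values)):
        next_value = new_values[t + 1] if t + 1 < len(new_values) else 0.0
        # TD error: reward plus discounted next prediction minus current prediction.
        delta = rewards[t] + gamma * next_value - new_values[t]
        new_values[t] += alpha * delta
    return new_values


if __name__ == "__main__":
    # Hypothetical trial: reward arrives only at the final step.
    values = [0.0, 0.0, 0.0]
    rewards = [0.0, 0.0, 1.0]
    for _ in range(500):
        values = td0_sweep(values, rewards)
    print("learned predictions:", [round(v, 3) for v in values])
    print("STDP change for dt=+10 ms:", stdp_weight_change(10.0))
    print("STDP change for dt=-10 ms:", stdp_weight_change(-10.0))
```

With repeated sweeps, the TD(0) predictions propagate backward from the rewarded step toward earlier steps, which is the kind of reward-predicting behavior the abstract attributes to the neuronal population.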


Accession: 051491277


PMID: 18941775

DOI: 10.1007/s00422-008-0265-6

