Spiking neural network model of free-energy-based reinforcement learning

Nakano, Takashi; Otsuka, Makoto

doi:10.1186/1471-2202-12-S1-P244

Volume 12 Supplement 1

Twentieth Annual Computational Neuroscience Meeting: CNS*2011

Poster presentation
Open access
Published: 18 July 2011


BMC Neuroscience volume 12, Article number: P244 (2011)


Reinforcement learning is a theoretical framework for learning how to act in an unknown environment through trial and error. One reinforcement learning framework, proposed by Sallans and Hinton [1] and referred to here as free-energy-based reinforcement learning (FERL), has many desirable characteristics, such as the ability to deal with high-dimensional sensory inputs and goal-directed representation learning, as well as neurally plausible characteristics, such as population coding of action-values and a Hebbian learning rule modulated by reward prediction errors. These characteristics suggest that FERL could be implemented in the brain. To understand the neural implementation of reinforcement learning and to examine the neural plausibility of FERL, we implemented FERL in a spiking neural network, which is more realistic than a network of binary stochastic neurons.

The FERL framework uses a restricted Boltzmann machine (RBM) as a building block. The RBM is an energy-based statistical model with binary nodes separated into visible and hidden layers. Because an RBM has no connections within a layer, the posterior distribution over the hidden nodes given the visible nodes factorizes, which makes the posterior simple to compute [2]. We implemented the RBM as a spiking neural network of leaky integrate-and-fire neurons. The network is composed of state, action, and hidden layers. The state and action layers consist of several modules (neuron groups), each associated with a particular state or action. All state neurons are unidirectionally connected to all hidden neurons. Action neurons are bidirectionally connected to hidden neurons so that the selected action is reflected in the hidden activations. The action-values, which FERL approximates by the negative free-energy, can be estimated from the firing of the hidden neurons. All connection weights are updated by a Hebbian learning rule modulated by the reward prediction error. The agent selects actions based on the activation of the action neurons.
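To make the role of the free energy concrete, the following is a minimal sketch of the non-spiking RBM version of FERL in the style of Sallans and Hinton [1], not the spiking network described here. It assumes binary state and one-hot action vectors, no bias terms, and a SARSA-style reward prediction error; the names `W`, `U`, `alpha`, and `gamma` are illustrative assumptions rather than symbols from the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_activation(W, U, s, a):
    """Posterior mean of each hidden node given state s and action a.

    The posterior factorizes over hidden nodes, so one sigmoid per node suffices.
    """
    return sigmoid(W @ s + U @ a)

def negative_free_energy(W, U, s, a):
    """Approximate the action-value Q(s, a) by -F(s, a) (assumed: no bias terms)."""
    x = W @ s + U @ a  # total input to each hidden node
    # -F = sum_k log(1 + exp(x_k)); logaddexp keeps this numerically stable.
    return np.sum(np.logaddexp(0.0, x))

def sarsa_update(W, U, s, a, r, s_next, a_next, alpha=0.01, gamma=0.9):
    """One Hebbian weight update scaled by the reward prediction error.

    The pre/post product h_k * s_j (or h_k * a_j) is the Hebbian term;
    delta is the SARSA temporal-difference error (an assumed choice).
    """
    q = negative_free_energy(W, U, s, a)
    q_next = negative_free_energy(W, U, s_next, a_next)
    delta = r + gamma * q_next - q
    h = hidden_activation(W, U, s, a)
    W_new = W + alpha * delta * np.outer(h, s)
    U_new = U + alpha * delta * np.outer(h, a)
    return W_new, U_new, delta
```

With `W` of shape (hidden, state) and `U` of shape (hidden, action), calling `sarsa_update` after each transition moves `-F(s, a)` toward the bootstrapped target. In the spiking network, the analytic `hidden_activation` would be replaced by a quantity read out from hidden-neuron firing.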

Our spiking neural network solved reinforcement learning tasks with both low- and high-dimensional observations. All desirable characteristics of the FERL framework were preserved in this extension. In both cases, the negative free-energy properly represented the action-values. The free-energies estimated by the spiking neural network correlated highly with those estimated by the original RBM. After reward-based learning, the activation patterns of the hidden neurons reflected goal-oriented, action-based categories (Figure 1).
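As a rough illustration of how a firing-based estimate of the free energy can be compared with the analytic RBM value, the toy sketch below replaces the leaky integrate-and-fire neurons with independent Bernoulli spike trains whose per-bin firing probability equals the RBM hidden activation. This is a simplifying assumption and not the authors' network; the weights, sizes, and bin count are arbitrary, and the resulting number says nothing about the results reported above.

```python
import numpy as np

def exact_free_energy(x):
    """Analytic F for hidden nodes receiving total input x (assumed: no biases)."""
    return -np.sum(np.logaddexp(0.0, x))

def spike_based_free_energy(x, n_bins, rng, eps=1e-6):
    """Estimate F from empirical firing rates instead of analytic activations."""
    p = 1.0 / (1.0 + np.exp(-x))               # target firing probability per bin
    spikes = rng.random((n_bins, x.size)) < p  # Bernoulli spikes (stand-in for LIF)
    h = np.clip(spikes.mean(axis=0), eps, 1.0 - eps)  # empirical rate per neuron
    expected_energy = -np.sum(h * x)
    negative_entropy = np.sum(h * np.log(h) + (1.0 - h) * np.log(1.0 - h))
    return expected_energy + negative_entropy

def compare(n_pairs=50, n_hidden=8, n_state=6, n_action=3, n_bins=200, seed=0):
    """Pearson correlation between exact and spike-based F over random inputs."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0, (n_hidden, n_state))   # hypothetical state weights
    U = rng.normal(0.0, 1.0, (n_hidden, n_action))  # hypothetical action weights
    exact, spiking = [], []
    for _ in range(n_pairs):
        s = rng.integers(0, 2, n_state).astype(float)  # random binary state
        a = np.eye(n_action)[rng.integers(n_action)]   # random one-hot action
        x = W @ s + U @ a
        exact.append(exact_free_energy(x))
        spiking.append(spike_based_free_energy(x, n_bins, rng))
    return np.corrcoef(exact, spiking)[0, 1]
```

Calling `compare()` returns a single correlation coefficient. Because the free energy is minimized at the true hidden activations, small errors in the estimated rates change it only to second order, which is one reason a rate-based readout can track the analytic value closely in this toy setting.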

Conclusions

Our spiking neural network implementation of FERL solves reinforcement learning tasks without losing the desirable characteristics of FERL. These results suggest that FERL is a candidate for the reinforcement learning rule implemented in the brain.

References

1. Sallans B, Hinton GE: Using free energies to represent Q-values in a multiagent reinforcement learning task. Advances in Neural Information Processing Systems 13. 2001.
2. Otsuka M, Yoshimoto J, Doya K: Robust population coding in free-energy-based reinforcement learning. International Conference on Artificial Neural Networks (ICANN). 2008, Part I: 377-386.


Author information

Authors and Affiliations

Okinawa Institute of Science and Technology, Onna, Okinawa, 904-0412, Japan
Takashi Nakano & Makoto Otsuka


Corresponding author

Correspondence to Takashi Nakano.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Nakano, T., Otsuka, M. Spiking neural network model of free-energy-based reinforcement learning. BMC Neurosci 12 (Suppl 1), P244 (2011). https://doi.org/10.1186/1471-2202-12-S1-P244

