This article is part of the supplement: Eighteenth Annual Computational Neuroscience Meeting: CNS*2009

Open Access Poster presentation

Controlling neuronal fluctuations for directed exploration during reinforcement learning

Orlando Areval* and Klaus Pawelzik

Author Affiliations

Institute for Theoretical Physics, University Bremen, Bremen, Germany

BMC Neuroscience 2009, 10(Suppl 1):P138 doi:10.1186/1471-2202-10-S1-P138


The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2202/10/S1/P138


Published: 13 July 2009

© 2009 Areval and Pawelzik; licensee BioMed Central Ltd.

Introduction

Neuronal and synaptic fluctuations have both been proposed to underlie reward-controlled learning [1,2] and have been used to explain song learning in songbird area RA [3]. The songbird area LMAN provides perturbations to area RA that are necessary for learning [4], suggesting that LMAN might target specific subsets of RA neurons and control the corresponding noise level for directed experimentation. Here we explore this hypothesis by investigating algorithms that control the amount of noise in order to achieve efficient reinforcement learning in large networks. Our research is guided by previous work on exploration for learning that exploits information gain [5]. We find that noise control can strongly increase learning efficiency, thereby attenuating the curse of dimensionality. Our results suggest that area LMAN controls experimentation by targeted control and injection of noise into RA, which might also have testable implications for learning in other motor pathways.
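To make the idea of targeted noise concrete, the following is a minimal toy sketch and not the algorithm studied in this abstract, which is not specified here. It assumes a generic perturbation-based reinforcement rule on a quadratic reward, with exploratory noise injected either into all parameters or only into a hand-picked subset. The names reward, learn, compare and noise_mask, the step-size rule, and all numerical settings are illustrative assumptions; in particular, the mask is fixed by hand, whereas the abstract is concerned with how such control could be achieved.

```python
import random


def reward(params, target):
    """Scalar reward: negative squared distance to the target parameters."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def learn(target, noise_mask, steps=300, sigma=0.01, seed=0):
    """Perturbation-based learning with noise only where noise_mask is 1.

    Each step perturbs the parameters, compares the perturbed reward with
    the unperturbed baseline, and moves along the perturbation in
    proportion to the reward change. Returns the final reward.
    """
    rng = random.Random(seed)
    n_noisy = sum(noise_mask)
    # Assumed step size: shrinks with the number of perturbed dimensions,
    # because the update variance grows with that number.
    eta = 1.0 / (2.0 * (n_noisy + 2))
    params = [0.0] * len(target)
    for _ in range(steps):
        # Gaussian exploration noise, injected only into masked dimensions.
        xi = [sigma * rng.gauss(0.0, 1.0) if m else 0.0 for m in noise_mask]
        baseline = reward(params, target)
        perturbed = reward([p + x for p, x in zip(params, xi)], target)
        gain = (perturbed - baseline) / sigma ** 2
        params = [p + eta * gain * x for p, x in zip(params, xi)]
    return reward(params, target)


def compare(dim=100, relevant=5):
    """Final rewards for noise in all dimensions vs. only the relevant ones."""
    target = [1.0] * relevant + [0.0] * (dim - relevant)
    everywhere = [1] * dim
    targeted = [1] * relevant + [0] * (dim - relevant)
    return learn(target, everywhere), learn(target, targeted)


if __name__ == "__main__":
    full, directed = compare()
    print("final reward, noise in all dimensions:", full)
    print("final reward, noise in relevant subset:", directed)
```

In this toy setting the learner with noise restricted to the relevant subset can take larger steps and is expected to end closer to zero reward after the same number of trials, which is one simple way to read the claim that noise control attenuates the curse of dimensionality.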

References

  1. Xie X, Seung HS: Learning in neural networks by reinforcement of irregular spiking.

    Physical Review E 2004, 69:041909.

  2. Seung HS: Learning in spiking neural networks by reinforcement of stochastic synaptic transmission.

    Neuron 2003, 40:1063-1073.

  3. Fiete I, Fee M, Seung HS: Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances.

    Journal of Neurophysiology 2007, 98:2038-2057.

  4. Ölveczky B, Andalman A, Fee M: Vocal experimentation in the juvenile songbird requires a basal ganglia circuit.

    PLoS Biol 2005, 3:e153.

  5. Si B, Pawelzik K: Robot exploration by subjectively maximizing objective information gain.

    Robotics and Biometrics IEEE International Conference 2004, 930-935.