Dopamine Bonuses

Kakade, Sham; Dayan, Peter

NeurIPS 2000 pp. 131-137

Abstract

Substantial data support a temporal difference (TD) model of dopamine (DA) neuron activity in which the cells provide a global error signal for reinforcement learning. However, in certain circumstances, DA activity seems anomalous under the TD model, responding to non-rewarding stimuli. We address these anomalies by suggesting that DA cells multiplex information about reward bonuses, including Sutton's exploration bonuses and Ng et al.'s non-distorting shaping bonuses. We interpret this additional role for DA in terms of the unconditional attentional and psychomotor effects of dopamine, having the computational role of guiding exploration.
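As a rough illustration of the idea in the abstract, the sketch below shows how a bonus added to the reward can produce a positive TD error for a non-rewarding stimulus. It assumes the standard TD(0) error and the potential-based form of shaping bonus usually associated with Ng et al.; the function names, the potential values, and the discount factor are hypothetical choices for illustration and are not taken from the paper.

```python
def td_error(reward, value_s, value_next, gamma=0.9):
    """Standard TD(0) prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s


def shaping_bonus(phi_s, phi_next, gamma=0.9):
    """Potential-based shaping bonus: F = gamma * Phi(s') - Phi(s).

    Phi is an assumed potential function over states (hypothetical here).
    """
    return gamma * phi_next - phi_s


def td_error_with_bonus(reward, value_s, value_next, phi_s, phi_next, gamma=0.9):
    """TD error computed on the bonus-augmented reward r + F."""
    bonus = shaping_bonus(phi_s, phi_next, gamma)
    return td_error(reward + bonus, value_s, value_next, gamma)


def discounted_bonus_sum(potentials, gamma=0.9):
    """Discounted sum of shaping bonuses along a trajectory of potentials."""
    return sum(
        (gamma ** t) * shaping_bonus(potentials[t], potentials[t + 1], gamma)
        for t in range(len(potentials) - 1)
    )


# A non-rewarding stimulus (reward 0, values 0) whose potential rises from 0 to 1.
plain = td_error(0.0, 0.0, 0.0)
with_bonus = td_error_with_bonus(0.0, 0.0, 0.0, phi_s=0.0, phi_next=1.0)
print(plain, with_bonus)
```

Under these assumptions the plain TD error is zero while the bonus-augmented error is positive. The discounted bonuses along a trajectory telescope to a term that depends only on the first and last potentials, which is one way to read "non-distorting".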

Cite

Text

Kakade and Dayan. "Dopamine Bonuses." Neural Information Processing Systems, 2000.

Markdown

[Kakade and Dayan. "Dopamine Bonuses." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/kakade2000neurips-dopamine/)

BibTeX

@inproceedings{kakade2000neurips-dopamine,
  title     = {{Dopamine Bonuses}},
  author    = {Kakade, Sham and Dayan, Peter},
  booktitle = {Neural Information Processing Systems},
  year      = {2000},
  pages     = {131-137},
  url       = {https://mlanthology.org/neurips/2000/kakade2000neurips-dopamine/}
}