Self-Efficacy Update in Reinforcement Learning: Impact on Goal Selection for Q-Learning Agents

Abstract

We introduce a dynamic self-efficacy learning rule and examine its impact on multi-goal selection in a grid-world. We model the Q-learning agent's self-efficacy as an integral of reward prediction errors (RPEs), which modulates the agent's expectation of achieving the best possible future outcome. Initial simulation results suggest that faster self-efficacy updates lead to higher overall reward accumulation, but with increased variability in reaching the optimal goal. These findings indicate that an optimal self-efficacy update rate, which can be learned through experience, may strike a balance between maximizing performance and maintaining stability.
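The abstract describes self-efficacy as an integral of reward prediction errors that modulates the agent's expectation of reaching the best possible outcome. Below is a minimal sketch of one way such a rule could sit alongside tabular Q-learning; the chain grid-world, the clipping of self-efficacy to [0, 1], the update rate eta_se, and the distance-discounted goal-selection rule are illustrative assumptions, not the authors' implementation.

import numpy as np

# Sketch (not the authors' code): tabular Q-learning on a short chain with two
# terminal goals, plus a scalar self-efficacy term that integrates RPEs and is
# used to modulate which goal the agent expects to be worth pursuing.

np.random.seed(0)

N_STATES = 7                  # states 0..6; terminal goals at both ends (assumed layout)
GOALS = {0: 1.0, 6: 5.0}      # left goal: small reward, right goal: large reward
ACTIONS = [-1, +1]            # move left / move right

alpha, gamma, epsilon = 0.1, 0.95, 0.1   # standard Q-learning parameters
eta_se = 0.05                            # self-efficacy update rate (the key knob in the paper)

Q = np.zeros((N_STATES, len(ACTIONS)))
self_efficacy = 0.5           # assumed initial self-efficacy in [0, 1]

def select_goal(state, se):
    """Hypothetical goal-selection rule: discount each goal's reward by
    self-efficacy raised to the number of steps needed to reach it, so low
    self-efficacy favours nearer (easier) goals."""
    scores = {g: (se ** abs(g - state)) * r for g, r in GOALS.items()}
    return max(scores, key=scores.get)

for episode in range(500):
    s = 3                                     # start in the middle of the chain
    while s not in GOALS:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            a = np.random.randint(len(ACTIONS))
        else:
            a = int(np.argmax(Q[s]))
        s_next = s + ACTIONS[a]
        r = GOALS.get(s_next, 0.0)
        done = s_next in GOALS

        # reward prediction error (RPE)
        target = r if done else r + gamma * np.max(Q[s_next])
        rpe = target - Q[s, a]

        # standard Q-value update
        Q[s, a] += alpha * rpe

        # self-efficacy update: running integral of RPEs, kept in [0, 1]
        self_efficacy = float(np.clip(self_efficacy + eta_se * rpe, 0.0, 1.0))

        s = s_next

print("final self-efficacy:", round(self_efficacy, 3))
print("goal selected from the start state:", select_goal(3, self_efficacy))

In this sketch, a larger eta_se makes self-efficacy track recent RPEs more aggressively, loosely mirroring the faster-update regime whose higher reward but greater variability the abstract reports.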

Cite

Text

Li and Radulescu. "Self-Efficacy Update in Reinforcement Learning: Impact on Goal Selection for Q-Learning Agents." NeurIPS 2024 Workshops: IMOL, 2024.

Markdown

[Li and Radulescu. "Self-Efficacy Update in Reinforcement Learning: Impact on Goal Selection for Q-Learning Agents." NeurIPS 2024 Workshops: IMOL, 2024.](https://mlanthology.org/neuripsw/2024/li2024neuripsw-selfefficacy/)

BibTeX

@inproceedings{li2024neuripsw-selfefficacy,
  title     = {{Self-Efficacy Update in Reinforcement Learning: Impact on Goal Selection for Q-Learning Agents}},
  author    = {Li, Jing and Radulescu, Angela},
  booktitle = {NeurIPS 2024 Workshops: IMOL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/li2024neuripsw-selfefficacy/}
}