Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning
Abstract
Goal-directed planning presents a challenge for classical RL algorithms due to the vastness of the combinatorial state and goal spaces, while humans and animals adapt to complex environments, especially with diverse, non-stationary objectives, often employing intermediate goals for long-horizon tasks. Here, we propose a goal reduction mechanism for effectively deriving subgoals from arbitrary and distant original goals, using a novel loop-removal technique. The product of the method, called goal-reducer, distills high-quality subgoals from a replay buffer, all without the need for prior global environmental knowledge. Simulations show that the goal-reducer can be integrated into RL frameworks like Deep Q-learning and Soft Actor-Critic. It accelerates performance in both discrete and continuous action space tasks, such as grid world navigation and robotic arm manipulation, relative to the corresponding standard RL models. Moreover, the goal-reducer, when combined with a local policy, without iterative training, outperforms its integrated deep RL counterparts in solving a navigation task. This goal reduction mechanism also models human problem-solving. Comparing the model's performance and activation with human behavior and fMRI data in a treasure hunting task, we found matching representational patterns between a goal-reducer agent's components and corresponding human brain areas, particularly the vmPFC and basal ganglia. The results suggest that humans may use a similar computational framework for goal-directed behaviors.
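As a rough illustration of the loop-removal idea named in the abstract, the sketch below cuts revisit-induced loops out of a stored trajectory and then reads (state, subgoal, goal) triples off the shortened path. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `remove_loops` and `subgoal_triples`, the use of hashable grid coordinates as states, and the midpoint rule for picking a subgoal are all hypothetical choices made for this example.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

State = Hashable


def remove_loops(trajectory: Sequence[State]) -> List[State]:
    """Return the trajectory with every revisit-induced loop cut out.

    Hypothetical helper: whenever a state is visited again, everything
    recorded since its first visit is discarded.
    """
    path: List[State] = []
    index_of: Dict[State, int] = {}  # position of each state currently in `path`
    for state in trajectory:
        if state in index_of:
            # Revisit: drop the detour that led back to this state.
            keep = index_of[state] + 1
            for dropped in path[keep:]:
                del index_of[dropped]
            del path[keep:]
        else:
            index_of[state] = len(path)
            path.append(state)
    return path


def subgoal_triples(path: Sequence[State]) -> List[Tuple[State, State, State]]:
    """List (state, subgoal, goal) triples from a loop-free path.

    Assumption for illustration only: the subgoal is the midpoint
    between the state and the goal along the path.
    """
    triples = []
    for i in range(len(path) - 2):
        for j in range(i + 2, len(path)):
            k = (i + j) // 2  # lies strictly between i and j because j >= i + 2
            triples.append((path[i], path[k], path[j]))
    return triples


if __name__ == "__main__":
    # A short grid-world walk that steps to (1, 1) and back again.
    walk = [(0, 0), (0, 1), (1, 1), (0, 1), (0, 2)]
    clean = remove_loops(walk)
    print(clean)
    print(subgoal_triples(clean))
```

In this toy walk the detour through `(1, 1)` is discarded, so the only remaining triple places `(0, 1)` between `(0, 0)` and `(0, 2)`. Triples of this kind could serve as training targets for a subgoal generator, though how the paper samples and filters them is not specified in the abstract.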
Cite
Text
Cheng and Brown. "Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-0104
Markdown
[Cheng and Brown. "Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/cheng2024neurips-goal/) doi:10.52202/079017-0104
BibTeX
@inproceedings{cheng2024neurips-goal,
title = {{Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning}},
author = {Cheng, Huzi and Brown, Joshua W.},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-0104},
url = {https://mlanthology.org/neurips/2024/cheng2024neurips-goal/}
}