GrASP: Gradient-Based Affordance Selection for Planning
Abstract
The ability to plan using a learned model is arguably a key component of intelligence. There are several challenges in realising such a component in large-scale reinforcement learning (RL) problems. One such challenge is dealing effectively with continuous action spaces when using tree-search planning (e.g., it is not feasible to consider every action even at just the root node of the tree). In this paper, we present a method for discovering affordances useful for planning---for learning which small number of actions/options from a continuous space of actions/options to consider in the tree-expansion process during planning. We consider affordances that are goal-and-state-conditional mappings to actions/options, as well as unconditional affordances that simply select actions/options available in all states. Our discovery method is gradient-based: we compute gradients through the planning procedure to update the parameters of the function that represents affordances. Our empirical work shows that it is indeed feasible to learn both primitive-action and option affordances in this way, and that model-based RL that simultaneously learns affordances and a value-equivalent model can outperform model-free RL.
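The abstract's core idea is that the affordance function proposes a small set of candidate actions at each tree node, and because the planning computation is differentiable, gradients of the planning loss flow back into the affordance parameters. The following is a minimal sketch of that idea, not the authors' implementation: `AffordanceNet`, the depth-1 expansion, and the assumed `model(state, action)` interface returning a differentiable value estimate (a value-equivalent model) are all illustrative placeholders.

```python
# Minimal sketch (PyTorch) of gradient-based affordance selection at a root node.
# Assumes a differentiable value-equivalent model `model(state, action) -> value`.
import torch
import torch.nn as nn

class AffordanceNet(nn.Module):
    """Maps a state to k candidate continuous actions to consider during tree expansion."""
    def __init__(self, state_dim, action_dim, k=4, hidden=128):
        super().__init__()
        self.k, self.action_dim = k, action_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, k * action_dim),
        )

    def forward(self, state):
        # (batch, k, action_dim): the k afforded actions for this state.
        return self.net(state).view(-1, self.k, self.action_dim)

def one_step_planning_value(state, affordances, model):
    """Expand the root with the k afforded actions and back up a soft maximum.

    The soft maximum keeps the backup differentiable with respect to every
    candidate action, so a planning/value loss on the returned estimate can be
    backpropagated into the affordance network's parameters.
    """
    actions = affordances(state)                                   # (B, k, action_dim)
    values = torch.stack(
        [model(state, actions[:, i]) for i in range(actions.shape[1])], dim=1
    )                                                              # (B, k)
    return torch.logsumexp(values, dim=1)                          # (B,)
```

In a full planner the same proposal step would be applied recursively at deeper nodes; this sketch only illustrates how the gradient path from the backed-up value to the affordance parameters arises.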
Cite
Text
Veeriah et al. "GrASP: Gradient-Based Affordance Selection for Planning." NeurIPS 2021 Workshops: DeepRL, 2021.
Markdown
[Veeriah et al. "GrASP: Gradient-Based Affordance Selection for Planning." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/veeriah2021neuripsw-grasp/)
BibTeX
@inproceedings{veeriah2021neuripsw-grasp,
title = {{GrASP: Gradient-Based Affordance Selection for Planning}},
author = {Veeriah, Vivek and Zheng, Zeyu and Lewis, Richard and Singh, Satinder},
booktitle = {NeurIPS 2021 Workshops: DeepRL},
year = {2021},
url = {https://mlanthology.org/neuripsw/2021/veeriah2021neuripsw-grasp/}
}