Improved Switching Among Temporally Abstract Actions
Abstract
In robotics and other control applications it is commonplace to have a pre-existing set of controllers for solving subtasks, perhaps hand-crafted or previously learned or planned, and still face a difficult problem of how to choose and switch among the controllers to solve an overall task as well as possible. In this paper we present a framework based on Markov decision processes and semi-Markov decision processes for phrasing this problem, a basic theorem regarding the improvement in performance that can be obtained by switching flexibly between given controllers, and example applications of the theorem. In particular, we show how an agent can plan with these high-level controllers and then use the results of such planning to find an even better plan, by modifying the existing controllers, with negligible additional cost and no re-planning. In one of our examples, the complexity of the problem is reduced from 24 billion state-action pairs to less than a million state-controller pairs.
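The improvement theorem rests on a simple switching rule: while a controller (option) is executing, interrupt it whenever the option-value function says some other controller is strictly better from the current state. The sketch below illustrates that rule with hypothetical option values; the state names, option names, and numbers are invented for illustration, not taken from the paper.

```python
# Toy sketch (assumed values, not from the paper): flexible switching
# among temporally abstract actions. Q[s][o] is a hypothetical
# option-value function over 3 states and 2 options.
Q = {
    "s0": {"go_left": 1.0, "go_right": 0.5},
    "s1": {"go_left": 0.2, "go_right": 0.9},
    "s2": {"go_left": 0.4, "go_right": 0.4},
}

def should_interrupt(Q, state, current_option):
    """Interrupt when continuing the current option is no longer greedy."""
    best = max(Q[state].values())
    return Q[state][current_option] < best

def improved_policy(Q, state):
    """Greedy switch: pick the option with the highest value at `state`."""
    return max(Q[state], key=Q[state].get)

# While executing go_left from s0, the agent reaches s1 and switches,
# which can only improve on committing to go_left until it terminates.
assert not should_interrupt(Q, "s0", "go_left")
assert should_interrupt(Q, "s1", "go_left")
assert improved_policy(Q, "s1") == "go_right"
```

Because the interrupted policy is greedy with respect to the same option values used for planning, it never does worse than running each controller to completion, which is the sense in which the paper's theorem delivers "an even better plan" without re-planning.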
Cite
Text
Sutton et al. "Improved Switching Among Temporally Abstract Actions." Neural Information Processing Systems, 1998.

Markdown
[Sutton et al. "Improved Switching Among Temporally Abstract Actions." Neural Information Processing Systems, 1998.](https://mlanthology.org/neurips/1998/sutton1998neurips-improved/)

BibTeX
@inproceedings{sutton1998neurips-improved,
title = {{Improved Switching Among Temporally Abstract Actions}},
author = {Sutton, Richard S. and Singh, Satinder P. and Precup, Doina and Ravindran, Balaraman},
booktitle = {Neural Information Processing Systems},
year = {1998},
pages = {1066--1072},
url = {https://mlanthology.org/neurips/1998/sutton1998neurips-improved/}
}