Policies Modulating Trajectory Generators
Abstract
We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000 rollouts. We also transfer these policies to a real robot and show locomotion with controllable forward velocity.
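The abstract's central idea, a simple policy that modulates a parametric trajectory generator, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the sine-wave generator, the names `SineTrajectoryGenerator` and `pmtg_step`, the base frequency and amplitude, the phase encoding fed back to the policy, and the additive correction term. Only the 4-dimensional observation and the linear policy are mentioned in the abstract.

```python
import numpy as np


class SineTrajectoryGenerator:
    """Hypothetical periodic generator; its phase acts as the controller's memory."""

    def __init__(self, dt=0.01):
        self.dt = dt      # control time step in seconds (assumed value)
        self.phase = 0.0  # internal state carried between steps

    def step(self, frequency, amplitude):
        # Advance the phase at the requested frequency, then emit a sine sample.
        self.phase = (self.phase + 2.0 * np.pi * frequency * self.dt) % (2.0 * np.pi)
        return amplitude * np.sin(self.phase)


def pmtg_step(tg, weights, observation):
    """One control step: a linear policy modulates the generator and adds a correction."""
    # Assumption: the policy sees the observation plus an encoding of the generator phase.
    policy_input = np.concatenate([observation, [np.sin(tg.phase), np.cos(tg.phase)]])
    out = weights @ policy_input  # linear policy, no hidden layers
    frequency = 1.0 + out[0]      # offset from an assumed base frequency (Hz)
    amplitude = 0.5 + out[1]      # offset from an assumed base amplitude
    correction = out[2]           # assumed direct feedback term added to the output
    return tg.step(frequency, amplitude) + correction


# Usage: with zero weights the action is the unmodulated base trajectory.
tg = SineTrajectoryGenerator(dt=0.01)
weights = np.zeros((3, 6))  # 3 policy outputs, 4 observation dims + 2 phase dims
action = pmtg_step(tg, weights, np.zeros(4))
```

In this sketch the generator supplies the prior knowledge (a periodic motion template) and the memory (its phase), so the policy itself can stay stateless and linear.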
Cite
Text
Iscen et al. "Policies Modulating Trajectory Generators." Conference on Robot Learning, 2018.
Markdown
[Iscen et al. "Policies Modulating Trajectory Generators." Conference on Robot Learning, 2018.](https://mlanthology.org/corl/2018/iscen2018corl-policies/)
BibTeX
@inproceedings{iscen2018corl-policies,
title = {{Policies Modulating Trajectory Generators}},
author = {Iscen, Atil and Caluwaerts, Ken and Tan, Jie and Zhang, Tingnan and Coumans, Erwin and Sindhwani, Vikas and Vanhoucke, Vincent},
booktitle = {Conference on Robot Learning},
year = {2018},
pages = {916--926},
url = {https://mlanthology.org/corl/2018/iscen2018corl-policies/}
}