AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale

Abstract

Robotic skills can be learned via imitation learning (IL) using user-provided demonstrations, or via reinforcement learning (RL) using large amounts of autonomously collected experience. The two methods have complementary strengths and weaknesses: RL can reach a high level of performance, but requires exploration, which can be very time-consuming and unsafe; IL does not require exploration, but only learns skills that are as good as the provided demonstrations. Can a single method combine the strengths of both approaches? A number of prior methods have aimed to address this question, proposing a variety of techniques that integrate elements of IL and RL. However, scaling up such methods to complex robotic skills that integrate diverse offline data and generalize meaningfully to real-world scenarios still presents a major challenge. In this paper, our aim is to test the scalability of prior IL + RL algorithms and devise a system based on detailed empirical experimentation that combines existing components in the most effective and scalable way. To that end, we present a series of experiments aimed at understanding the implications of each design decision, so as to develop a combined approach that can utilize demonstrations and heterogeneous prior data to attain the best performance on a range of real-world and realistic simulated robotic problems. Our complete method, which we call AW-Opt, combines elements of advantage-weighted regression and QT-Opt, providing a unified approach for integrating demonstrations and offline data for robotic manipulation. Please see https://awopt.github.io for more details.

Cite

Text

Lu et al. "AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale." Conference on Robot Learning, 2021.

Markdown

[Lu et al. "AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale." Conference on Robot Learning, 2021.](https://mlanthology.org/corl/2021/lu2021corl-awopt/)

BibTeX

@inproceedings{lu2021corl-awopt,
  title     = {{AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale}},
  author    = {Lu, Yao and Hausman, Karol and Chebotar, Yevgen and Yan, Mengyuan and Jang, Eric and Herzog, Alexander and Xiao, Ted and Irpan, Alex and Khansari, Mohi and Kalashnikov, Dmitry and Levine, Sergey},
  booktitle = {Conference on Robot Learning},
  year      = {2021},
  pages     = {1078--1088},
  volume    = {164},
  url       = {https://mlanthology.org/corl/2021/lu2021corl-awopt/}
}