Reinforcement Learning via Auxiliary Task Distillation

Abstract

We present Reinforcement Learning via Auxiliary Task Distillation (AuxDistill), a new method that enables reinforcement learning (RL) to solve long-horizon robot control problems by distilling behaviors from auxiliary RL tasks. AuxDistill achieves this by performing multi-task RL concurrently on auxiliary tasks that are easier to learn and relevant to the main task. A weighted distillation loss transfers behaviors from these auxiliary tasks to solve the main task. We demonstrate that AuxDistill learns a pixels-to-actions policy for a challenging multi-stage embodied object rearrangement task from the environment reward alone, without demonstrations, a learning curriculum, or pre-trained skills. AuxDistill achieves a 2.3× higher success rate than the previous state-of-the-art baseline on the Habitat Object Rearrangement benchmark and outperforms methods that use pre-trained skills and expert demonstrations.
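
The core mechanism described above, a weighted distillation loss that transfers auxiliary-task behaviors into the main-task policy alongside a standard multi-task RL objective, can be sketched in a few lines. The Python snippet below is an illustrative sketch only, not the paper's implementation; the names weighted_distillation_loss, aux_weights, and distill_coef are assumptions introduced here for clarity.

import torch
import torch.nn.functional as F

def weighted_distillation_loss(main_logits, aux_logits, aux_weights):
    """Distill auxiliary-task action distributions into the main policy.

    main_logits:  (batch, n_actions) logits of the main-task policy.
    aux_logits:   (batch, n_actions) logits of an auxiliary-task policy
                  evaluated on the same observations; detached so it
                  acts as a fixed teacher for this batch.
    aux_weights:  (batch,) per-sample relevance weights gating how
                  strongly each auxiliary behavior is transferred.
    """
    teacher = F.softmax(aux_logits.detach(), dim=-1)
    log_student = F.log_softmax(main_logits, dim=-1)
    # Per-sample KL(teacher || student), then a weighted average.
    per_sample_kl = (teacher * (teacher.log() - log_student)).sum(dim=-1)
    return (aux_weights * per_sample_kl).mean()

# Hypothetical usage: combine with a standard RL loss (e.g., PPO),
# scaled by a distillation coefficient.
#   total_loss = rl_loss + distill_coef * weighted_distillation_loss(
#       main_logits, aux_logits, aux_weights)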

Cite

Text

Harish et al. "Reinforcement Learning via Auxiliary Task Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73004-7_13

Markdown

[Harish et al. "Reinforcement Learning via Auxiliary Task Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/harish2024eccv-reinforcement/) doi:10.1007/978-3-031-73004-7_13

BibTeX

@inproceedings{harish2024eccv-reinforcement,
  title     = {{Reinforcement Learning via Auxiliary Task Distillation}},
  author    = {Harish, Abhinav N and Heck, Larry and Hanna, Josiah P and Kira, Zsolt and Szot, Andrew},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73004-7_13},
  url       = {https://mlanthology.org/eccv/2024/harish2024eccv-reinforcement/}
}