Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks
Abstract
Meta-reinforcement learning (meta-RL) aims to develop policies that generalize to unseen tasks sampled from a task distribution. While context-based meta-RL methods improve task representation using task latents, they often struggle with out-of-distribution (OOD) tasks. To address this, we propose Task-Aware Virtual Training (TAVT), a novel algorithm that accurately captures task characteristics for both training and OOD scenarios using metric-based representation learning. Our method successfully preserves task characteristics in virtual tasks and employs a state regularization technique to mitigate overestimation errors in state-varying environments. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments. Our code is available at https://github.com/JM-Kim-94/tavt.git.
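The abstract only names the components of TAVT. As a rough, hedged illustration of what "metric-based representation learning" for task latents can look like, the sketch below trains a task encoder so that distances between task latents track a precomputed behavioural distance between tasks. All names here (`TaskEncoder`, `metric_alignment_loss`, `task_dists`) are hypothetical and are not taken from the paper; the actual TAVT objective and virtual-task construction are defined in the paper and the reference code linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskEncoder(nn.Module):
    """Encodes a batch of context transitions (s, a, r, s') into a task latent."""

    def __init__(self, context_dim: int, latent_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (num_tasks, num_transitions, context_dim).
        # Average per-transition embeddings for a permutation-invariant task latent.
        return self.net(context).mean(dim=1)


def metric_alignment_loss(latents: torch.Tensor, task_dists: torch.Tensor) -> torch.Tensor:
    """Match pairwise latent distances to a behavioural task metric.

    task_dists[i, j] is assumed to be a precomputed distance between tasks i and j,
    e.g. derived from reward/transition discrepancies on shared contexts.
    """
    latent_dists = torch.cdist(latents, latents, p=2)  # pairwise L2 distances
    return F.mse_loss(latent_dists, task_dists)
```

Under this kind of objective, latents of similar tasks stay close while dissimilar tasks are pushed apart, which is one way a latent space could remain informative when interpolated or extrapolated to construct virtual (OOD-like) tasks; the paper's precise metric and training procedure should be taken from the official implementation.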
Cite
Text
Kim et al. "Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Kim et al. "Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/kim2025icml-taskaware/)
BibTeX
@inproceedings{kim2025icml-taskaware,
title = {{Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks}},
author = {Kim, Jeongmo and Park, Yisak and Kim, Minung and Han, Seungyul},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {30720--30748},
volume = {267},
url = {https://mlanthology.org/icml/2025/kim2025icml-taskaware/}
}