$\texttt{PREMIER-TACO}$ Is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
Abstract
We introduce $\texttt{Premier-TACO}$, a novel multitask feature representation learning methodology aiming to enhance the efficiency of few-shot policy learning in sequential decision-making tasks. $\texttt{Premier-TACO}$ pretrains a general feature representation using a small subset of relevant multitask offline datasets, capturing essential environmental dynamics. This representation can then be fine-tuned to specific tasks with few expert demonstrations. Building upon the recent temporal action contrastive learning (TACO) objective, which achieves state-of-the-art performance in visual control tasks, $\texttt{Premier-TACO}$ additionally employs a simple yet effective negative example sampling strategy. This key modification ensures computational efficiency and scalability for large-scale multitask offline pretraining. Experimental results from both the DeepMind Control Suite and MetaWorld domains underscore the effectiveness of $\texttt{Premier-TACO}$ for pretraining visual representations, facilitating efficient few-shot imitation learning of unseen tasks. On the DeepMind Control Suite, $\texttt{Premier-TACO}$ achieves an average improvement of 101% over a carefully implemented Learn-from-scratch baseline, and a 24% improvement over the most effective baseline pretraining method. Similarly, on MetaWorld, $\texttt{Premier-TACO}$ obtains an average improvement of 74% over Learn-from-scratch and a 40% improvement over the best baseline pretraining method.
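The TACO-style objective the abstract builds on is an InfoNCE-style contrastive loss: an anchor embedding (state plus action context) is pulled toward the embedding of the resulting future state, while other samples serve as negatives. As a rough illustration only, here is a minimal sketch of such a loss with negatives drawn from within the batch (the function name, temperature value, and batched negative scheme are assumptions for illustration, not the authors' exact implementation):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Minimal InfoNCE-style contrastive loss (illustrative sketch).

    anchors:   (B, D) embeddings of state-action context.
    positives: (B, D) embeddings of the corresponding future states;
               for anchor i, rows j != i act as in-batch negatives.
    """
    # Normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature              # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with the matching pair (the diagonal) as the target.
    return -np.mean(np.diag(log_probs))
```

When the anchor and positive embeddings agree, the diagonal dominates each row and the loss is small; mismatched pairs drive it up. Sampling negatives from within the batch, rather than constructing them separately per task, is the kind of simplification that keeps multitask pretraining scalable.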
Cite
Text
Zheng et al. "$\texttt{PREMIER-TACO}$ Is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss." NeurIPS 2023 Workshops: FMDM, 2023.
Markdown
[Zheng et al. "$\texttt{PREMIER-TACO}$ Is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss." NeurIPS 2023 Workshops: FMDM, 2023.](https://mlanthology.org/neuripsw/2023/zheng2023neuripsw-premiertaco/)
BibTeX
@inproceedings{zheng2023neuripsw-premiertaco,
title = {{$\texttt{PREMIER-TACO}$ Is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss}},
author = {Zheng, Ruijie and Liang, Yongyuan and Wang, Xiyao and Ma, Shuang and Daum{\'e} III, Hal and Xu, Huazhe and Langford, John and Palanisamy, Praveen and Basu, Kalyan and Huang, Furong},
booktitle = {NeurIPS 2023 Workshops: FMDM},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/zheng2023neuripsw-premiertaco/}
}