Tackling AlfWorld with Action Attention and Common Sense from Language Models
Abstract
Pre-trained language models (LMs) capture strong prior knowledge about the world. This common sense knowledge can be used in control tasks. However, directly generating actions from an LM may produce a reasonable narrative that is nonetheless not executable by a low-level agent. We propose instead to use the knowledge in LMs to simplify the control problem and to assist in training the low-level actor. We implement a novel question-answering framework that simplifies observations, together with an agent based on action attention that handles arbitrary roll-out lengths and action-space sizes. On the AlfWorld benchmark for indoor instruction following, our novel object-masking and action-attention method achieves a significantly higher success rate (50% over the baseline).
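The abstract does not spell out the action-attention mechanism; a minimal sketch of one plausible realization follows, in which a state encoding attends over encodings of the currently available action texts, so the policy needs no fixed-size output head and can score however many candidate actions each step offers. All class, variable, and dimension names here are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of "action attention": score a variable-size set of
# candidate actions by attending from the state encoding to each action's
# encoding. Names and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn


class ActionAttentionPolicy(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.state_proj = nn.Linear(state_dim, hidden_dim)    # query projection
        self.action_proj = nn.Linear(action_dim, hidden_dim)  # key projection

    def forward(self, state_emb: torch.Tensor, action_embs: torch.Tensor) -> torch.Tensor:
        # state_emb:   (state_dim,)             encoding of the current observation
        # action_embs: (num_actions, action_dim) encodings of the candidate actions;
        # num_actions can change every step, so no fixed output layer is required.
        q = self.state_proj(state_emb)          # (hidden_dim,)
        k = self.action_proj(action_embs)       # (num_actions, hidden_dim)
        scores = k @ q / k.shape[-1] ** 0.5     # (num_actions,) scaled attention logits
        return torch.softmax(scores, dim=-1)    # distribution over candidate actions


# Usage: pick the highest-scoring candidate and hand it to the low-level actor.
policy = ActionAttentionPolicy(state_dim=768, action_dim=768)
probs = policy(torch.randn(768), torch.randn(5, 768))  # 5 candidates this step
action_idx = int(probs.argmax())
```

Because the scores come from a dot product rather than a fixed classification head, the same parameters apply whether a step offers 3 admissible actions or 30, which is what lets such an agent handle arbitrary action-space sizes.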
Cite
Text
Wu et al. "Tackling AlfWorld with Action Attention and Common Sense from Language Models." NeurIPS 2022 Workshops: LaReL, 2022.
Markdown
[Wu et al. "Tackling AlfWorld with Action Attention and Common Sense from Language Models." NeurIPS 2022 Workshops: LaReL, 2022.](https://mlanthology.org/neuripsw/2022/wu2022neuripsw-tackling/)
BibTeX
@inproceedings{wu2022neuripsw-tackling,
title = {{Tackling AlfWorld with Action Attention and Common Sense from Language Models}},
author = {Wu, Yue and Min, So Yeon and Bisk, Yonatan and Salakhutdinov, Ruslan and Prabhumoye, Shrimai},
booktitle = {NeurIPS 2022 Workshops: LaReL},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/wu2022neuripsw-tackling/}
}