MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Abstract
Significant advances have been made in developing general-purpose embodied AI in environments like Minecraft through the adoption of LLM-augmented hierarchical approaches. While these approaches, which combine high-level planners with low-level controllers, show promise, low-level controllers frequently become performance bottlenecks due to repeated failures. In this paper, we argue that the primary cause of failure in many low-level controllers is the absence of an episodic memory system. To address this, we introduce MrSteve (Memory Recall Steve), a novel low-level controller equipped with Place Event Memory (PEM), a form of episodic memory that captures what, where, and when information from episodes. This directly addresses the main limitation of the popular low-level controller, Steve-1. Unlike previous models that rely on short-term memory, PEM organizes spatial and event-based data, enabling efficient recall and navigation in long-horizon tasks. Additionally, we propose an Exploration Strategy and a Memory-Augmented Task Solving Framework, allowing agents to alternate between exploration and task-solving based on recalled events. Our approach significantly improves task-solving and exploration efficiency compared to existing methods. We will release our code and demos on the project page: https://sites.google.com/view/mr-steve.
Cite
Text
Park et al. "MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory." International Conference on Learning Representations, 2025.
Markdown
[Park et al. "MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/park2025iclr-mrsteve/)
BibTeX
@inproceedings{park2025iclr-mrsteve,
title = {{MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory}},
author = {Park, Junyeong and Cho, Junmo and Ahn, Sungjin},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/park2025iclr-mrsteve/}
}