Learning Memory Mechanisms for Decision Making Through Demonstration

Abstract

In Partially Observable Markov Decision Processes, integrating an agent's history into memory poses a significant challenge for decision-making. Traditional imitation learning, which relies on observation-action pairs from expert demonstrations, fails to capture the memory mechanisms the expert uses to make decisions. To capture memory processes as demonstrations, we introduce the concept of **memory dependency pairs** $(p, q)$, indicating that events at time $p$ are recalled when making a decision at time $q$. We then propose **AttentionTuner** to leverage memory dependency pairs in Transformers, and we find significant improvements over standard Transformers across several tasks from Memory Gym and the Long-term Memory Benchmark.
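
As a rough illustration of how memory dependency pairs might supervise a Transformer's attention, here is a minimal sketch of an auxiliary loss that encourages the attention row for query step $q$ to place mass on key step $p$. The abstract does not specify AttentionTuner's exact loss, so the function name, loss form, and tensor layout below are assumptions for illustration only.

```python
import torch

def memory_attention_loss(attn_weights: torch.Tensor,
                          memory_pairs: list[tuple[int, int]]) -> torch.Tensor:
    """Auxiliary loss on one attention head (hypothetical sketch).

    attn_weights: (T, T) post-softmax attention matrix, where row q holds the
                  attention distribution of the query at time step q over keys.
    memory_pairs: memory dependency pairs (p, q) with p <= q, meaning the event
                  at step p should be recalled when acting at step q.
    """
    loss = attn_weights.new_zeros(())
    for p, q in memory_pairs:
        # Negative log-likelihood of attending to key step p from query step q.
        loss = loss - torch.log(attn_weights[q, p] + 1e-8)
    return loss / max(len(memory_pairs), 1)
```

In practice such a term would be added to the standard behavioral-cloning loss over observation-action pairs, so the demonstrations supervise both the policy's actions and where it attends.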

Cite

Text

Yue et al. "Learning Memory Mechanisms for Decision Making Through Demonstration." ICLR 2025 Workshops: NFAM, 2025.

Markdown

[Yue et al. "Learning Memory Mechanisms for Decision Making Through Demonstration." ICLR 2025 Workshops: NFAM, 2025.](https://mlanthology.org/iclrw/2025/yue2025iclrw-learning/)

BibTeX

@inproceedings{yue2025iclrw-learning,
  title     = {{Learning Memory Mechanisms for Decision Making Through Demonstration}},
  author    = {Yue, William and Liu, Bo and Stone, Peter},
  booktitle = {ICLR 2025 Workshops: NFAM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/yue2025iclrw-learning/}
}