Planning with Quantized Opponent Models

Abstract

Planning under opponent uncertainty is a fundamental challenge in multi-agent environments, where an agent must act while inferring the hidden policies of its opponents. Existing type-based methods rely on manually defined behavior classes and struggle to scale, while model-free approaches are sample-inefficient and lack a principled way to incorporate uncertainty into planning. We propose Quantized Opponent Models (QOM), which learn a compact catalog of opponent types via a quantized autoencoder and maintain a Bayesian belief over these types online. This posterior supports both a belief-weighted meta-policy and a Monte-Carlo planning algorithm that directly integrates uncertainty, enabling real-time belief updates and focused exploration. Experiments show that QOM achieves superior performance with lower search cost, offering a tractable and effective solution for belief-aware planning.
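As a rough illustration of the online belief update and belief-weighted meta-policy mentioned in the abstract, the sketch below applies Bayes' rule over a small discrete catalog of opponent types. It is not the authors' implementation. The function names, the three-type catalog, and all probability tables are hypothetical values chosen for the example, and the catalog is hard-coded here rather than learned by a quantized autoencoder.

```python
import numpy as np


def update_belief(belief, action_likelihoods):
    """One Bayes step: posterior is proportional to prior * P(action | type)."""
    posterior = belief * action_likelihoods
    total = posterior.sum()
    if total == 0.0:
        # The observation is impossible under every type; keep the prior.
        return belief.copy()
    return posterior / total


def belief_weighted_policy(belief, response_policies):
    """Mix per-type response policies (shape K x A) using the current belief."""
    return belief @ response_policies


# Hypothetical catalog: 3 opponent types, each a distribution over 2 opponent actions.
opponent_models = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
])

# Hypothetical per-type response policies for our agent (rows sum to 1).
response_policies = np.array([
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])

belief = np.full(3, 1.0 / 3.0)  # uniform prior over types
for observed_action in [0, 0, 0]:
    # Likelihood of the observed opponent action under each type.
    belief = update_belief(belief, opponent_models[:, observed_action])

mixed_policy = belief_weighted_policy(belief, response_policies)
```

After repeated observations of action 0, the belief concentrates on the first type and the mixed policy shifts toward that type's response. A planner could sample opponent types from the same belief to focus its simulations, which is one plausible reading of how the posterior feeds Monte-Carlo planning.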

Cite

Text

Yu et al. "Planning with Quantized Opponent Models." Advances in Neural Information Processing Systems, 2025.

Markdown

[Yu et al. "Planning with Quantized Opponent Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/yu2025neurips-planning/)

BibTeX

@inproceedings{yu2025neurips-planning,
  title     = {{Planning with Quantized Opponent Models}},
  author    = {Yu, XiaoPeng and Su, Kefan and Lu, Zongqing},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/yu2025neurips-planning/}
}