Planning with Quantized Opponent Models
Abstract
Planning under opponent uncertainty is a fundamental challenge in multi-agent environments, where an agent must act while inferring the hidden policies of its opponents. Existing type-based methods rely on manually defined behavior classes and struggle to scale, while model-free approaches are sample-inefficient and lack a principled way to incorporate uncertainty into planning. We propose Quantized Opponent Models (QOM), which learn a compact catalog of opponent types via a quantized autoencoder and maintain a Bayesian belief over these types online. This posterior supports both a belief-weighted meta-policy and a Monte-Carlo planning algorithm that directly integrates uncertainty, enabling real-time belief updates and focused exploration. Experiments show that QOM achieves superior performance with lower search cost, offering a tractable and effective solution for belief-aware planning.
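The core online inference step described above, maintaining a Bayesian belief over a discrete catalog of opponent types, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the number of types `K` and the per-type likelihoods of the observed opponent action are assumed values for demonstration.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """One Bayesian belief update over discrete opponent types.

    belief:      prior probability over the K quantized types
    likelihoods: probability each type's policy assigns to the
                 opponent action just observed (assumed inputs)
    """
    posterior = belief * likelihoods          # Bayes' rule, unnormalized
    total = posterior.sum()
    if total == 0.0:                          # guard: no type explains the action
        return np.full_like(belief, 1.0 / len(belief))
    return posterior / total

# Uniform prior over K = 4 hypothetical opponent types.
belief = np.full(4, 0.25)
# Likelihood of the observed action under each type (illustrative numbers).
likelihoods = np.array([0.7, 0.1, 0.1, 0.1])
belief = update_belief(belief, likelihoods)
# The posterior now concentrates on type 0.
```

In a planner, this posterior would weight the per-type value estimates (for the meta-policy) or the root-node type sampling (for Monte-Carlo search); those integration details are beyond this sketch.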
Cite
Text
Yu et al. "Planning with Quantized Opponent Models." Advances in Neural Information Processing Systems, 2025.

Markdown

[Yu et al. "Planning with Quantized Opponent Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/yu2025neurips-planning/)

BibTeX
@inproceedings{yu2025neurips-planning,
  title = {{Planning with Quantized Opponent Models}},
  author = {Yu, XiaoPeng and Su, Kefan and Lu, Zongqing},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2025},
  url = {https://mlanthology.org/neurips/2025/yu2025neurips-planning/}
}