Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits

Abstract

In this work, we consider the data-driven assortment optimization problem under the linear multinomial logit(MNL) choice model. We first establish a improved confidence region for the maximum likelihood estimator (MLE) of the $d$-dimensional linear MNL likelihood function that removes the explicit dependency on a problem-dependent parameter $\kappa^{-1}$ in previous result (Oh and Iyengar, 2021), which scales exponentially with the radius of the parameter set. Building on the confidence region result, we investigate the data-driven assortment optimization problem in both offline and online settings. In the offline setting, the previously best-known result scales as $\tilde{O}\left(\sqrt{\frac{d}{\kappa n_{S^\star}}}\right)$, where $n_{S^\star}$ the number of times that optimal assortment $S^\star$ is observed (Dong et al., 2023). We propose a new pessimistic-based algorithm that, under a burn-in condition, removes the dependency on $d,\kappa^{-1}$ in the leading order bound and works under a more relaxed coverage condition, without requiring the exact observation of $S^\star$. In the online setting, we propose the first algorithm to achieve $\tilde{O}(\sqrt{dT})$ regret without a multiplicative dependency on $\kappa^{-1}$. In both settings, our results nearly achieve the corresponding lower bound when reduced to the canonical $N$-item MNL problem, demonstrating their optimality.

Cite

Text

Han et al. "Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits." Advances in Neural Information Processing Systems, 2025.

Markdown

[Han et al. "Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/han2025neurips-improved/)

BibTeX

@inproceedings{han2025neurips-improved,
  title     = {{Improved Confidence Regions and Optimal Algorithms for Online and Offline Linear MNL Bandits}},
  author    = {Han, Yuxuan and Blanchet, Jose and Zhou, Zhengyuan},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/han2025neurips-improved/}
}