Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
Abstract
Variance-dependent regret bounds have received increasing attention in recent studies on contextual bandits. However, most of these studies focus on upper confidence bound (UCB)-based bandit algorithms, while sampling-based bandit algorithms such as Thompson sampling remain understudied. The only exception is the `LinVDTS` algorithm (Xu et al., 2023), which is limited to linear reward functions and whose regret bound is not optimal with respect to the model dimension. In this paper, we present `FGTS-VA`, a variance-aware Thompson sampling algorithm for contextual bandits with general reward functions that achieves an optimal regret bound. At the core of our analysis is an extension of the decoupling coefficient, a technique commonly used in the analysis of Feel-Good Thompson sampling (FGTS) that reflects the complexity of the model space. With the new decoupling coefficient denoted by $\mathrm{dc}$, `FGTS-VA` achieves a regret of $\tilde{\mathcal{O}}(\sqrt{\mathrm{dc}\cdot\log|\mathcal{F}|\sum_{t=1}^T\sigma_t^2}+\mathrm{dc})$, where $|\mathcal{F}|$ is the size of the model space, $T$ is the total number of rounds, and $\sigma_t^2$ is the subgaussian norm of the noise (e.g., the variance when the noise is Gaussian) at round $t$. In the setting of contextual linear bandits, the regret bound of `FGTS-VA` matches that of UCB-based algorithms using weighted linear regression (Zhou and Gu, 2022).
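To illustrate how the stated bound adapts to the noise level, the following sketch specializes it to two simple cases. It uses only the rate quoted in the abstract. The uniform noise bound $\sigma$ is a hypothetical quantity introduced here for illustration, not a symbol from the paper.

$$
\sigma_t \le \sigma \ \text{for all } t
\;\Longrightarrow\;
\sum_{t=1}^T \sigma_t^2 \le \sigma^2 T
\;\Longrightarrow\;
\tilde{\mathcal{O}}\Big(\sqrt{\mathrm{dc}\cdot\log|\mathcal{F}|\sum_{t=1}^T\sigma_t^2}+\mathrm{dc}\Big)
\le
\tilde{\mathcal{O}}\Big(\sigma\sqrt{\mathrm{dc}\cdot\log|\mathcal{F}|\cdot T}+\mathrm{dc}\Big),
$$

$$
\sigma_t = 0 \ \text{for all } t
\;\Longrightarrow\;
\tilde{\mathcal{O}}\Big(\sqrt{\mathrm{dc}\cdot\log|\mathcal{F}|\sum_{t=1}^T\sigma_t^2}+\mathrm{dc}\Big)
=
\tilde{\mathcal{O}}(\mathrm{dc}).
$$

Under these assumptions, the first case gives a worst-case rate that grows as $\sqrt{T}$, while in the noiseless case the bound no longer grows polynomially in $T$ (the $\tilde{\mathcal{O}}$ notation may still hide logarithmic factors).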
Cite
Text
Li and Gu. "Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits." Advances in Neural Information Processing Systems, 2025.
Markdown
[Li and Gu. "Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/li2025neurips-varianceaware/)
BibTeX
@inproceedings{li2025neurips-varianceaware,
title = {{Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits}},
author = {Li, Xuheng and Gu, Quanquan},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/li2025neurips-varianceaware/}
}