A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems

Loh, William; Sinha, Sajib Kumer; Agarwal, Ankur; Poupart, Pascal

A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems

William Loh, Sajib Kumer Sinha, Ankur Agarwal, Pascal Poupart

TMLR 2026

/tmlr/2026/loh2026tmlr-practical/

Abstract

Contextual bandits are incredibly useful in many practical problems. We go one step further by devising a more realistic problem that combines: (1) contextual bandits with dense arm features, (2) non-linear reward functions, and (3) a generalization of correlated bandits where reward distributions change over time but the degree of correlation maintains. This formulation lends itself to a wider set of applications such as recommendation tasks. To solve this problem, we introduce *conditionally coupled contextual* ($C_3$) Thompson sampling for Bernoulli bandits. It combines an improved Nadaraya-Watson estimator on an embedding space with Thompson sampling that allows online learning without retraining. Empirical results show that $C_3$ outperforms the next best algorithm by 5.7% lower average cumulative regret on four OpenML tabular datasets as well as demonstrating a 12.4% click lift on Microsoft News Dataset (MIND) compared to other algorithms.

PDF TMLR OpenReview Code Semantic Scholar

Cite

Text

Loh et al. "A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems." Transactions on Machine Learning Research, 2026.

Markdown

[Loh et al. "A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/loh2026tmlr-practical/)

BibTeX

@article{loh2026tmlr-practical,
  title     = {{A Practical Algorithm for Feature-Rich, Non-Stationary Bandit Problems}},
  author    = {Loh, William and Sinha, Sajib Kumer and Agarwal, Ankur and Poupart, Pascal},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/loh2026tmlr-practical/}
}