Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback

UAI 2025 pp. 1533-1568

Abstract

We study the Logistic Contextual Slate Bandit problem, where, at each round, an agent selects a slate of $N$ items from an exponentially large set (of size $2^{\Omega(N)}$) of candidate slates provided by the environment. A single binary reward, determined by a logistic model, is observed for the chosen slate. Our objective is to develop algorithms that maximize cumulative reward over $T$ rounds while maintaining low per-round computational costs. We propose two algorithms, Slate-GLM-OFU and Slate-GLM-TS, that accomplish this goal. These algorithms achieve $N^{O(1)}$ per-round time complexity via local planning (independent slot selections), and low regret through global learning (joint parameter estimation). We provide theoretical and empirical evidence supporting these claims. Under a well-studied diversity assumption, we prove that Slate-GLM-OFU incurs only $\tilde{O}(\sqrt{T})$ regret. Extensive experiments across a wide range of synthetic settings demonstrate that our algorithms consistently outperform state-of-the-art baselines, achieving both the lowest regret and the fastest runtime. Furthermore, we apply our algorithm to select in-context examples in prompts of Language Models for solving binary classification tasks such as sentiment analysis. Our approach achieves competitive test accuracy, making it a viable alternative in practical scenarios.
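The "local planning" idea in the abstract can be illustrated with a small sketch. It is not the authors' Slate-GLM-OFU or Slate-GLM-TS: it assumes (hypothetically) that a slate's feature vector is the concatenation of its per-slot item features, that the parameter `theta` is already known, and it omits the exploration bonus or posterior sampling a bandit algorithm would add. Under those assumptions the logistic mean reward is a monotone function of a sum of per-slot scores, so choosing the best item in each slot independently yields the same slate as searching all $K^N$ combinations. All function and variable names below are illustrative.

```python
import itertools
import numpy as np

def sigmoid(z):
    """Logistic link mapping a linear score to a reward probability."""
    return 1.0 / (1.0 + np.exp(-z))

def pick_slate_per_slot(items, theta):
    """Choose one item per slot independently.

    items: array of shape (N, K, d), K candidate feature vectors per slot.
    theta: array of shape (N, d), assumed-known parameter block per slot.
    Cost is O(N * K * d) rather than O(K ** N).
    """
    scores = np.einsum("nkd,nd->nk", items, theta)  # per-slot linear scores
    return scores.argmax(axis=1)

def pick_slate_brute_force(items, theta):
    """Enumerate every slate and keep the one with the highest mean reward."""
    n_slots, n_items, _ = items.shape
    best, best_val = None, -np.inf
    for slate in itertools.product(range(n_items), repeat=n_slots):
        z = sum(items[i, k] @ theta[i] for i, k in enumerate(slate))
        val = sigmoid(z)
        if val > best_val:
            best, best_val = slate, val
    return np.array(best)

# Small random instance: 4 slots, 3 candidates per slot, 2 features per item.
rng = np.random.default_rng(0)
items = rng.normal(size=(4, 3, 2))
theta = rng.normal(size=(4, 2))

greedy = pick_slate_per_slot(items, theta)
exhaustive = pick_slate_brute_force(items, theta)
print(greedy, exhaustive)
```

The two selections agree here because the sigmoid is strictly increasing and the score decomposes additively across slots; the brute-force search is included only as a check and would be infeasible for large $N$.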

Cite

Text

Goyal and Sinha. "Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.

Markdown

[Goyal and Sinha. "Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.](https://mlanthology.org/uai/2025/goyal2025uai-efficient/)

BibTeX

@inproceedings{goyal2025uai-efficient,
  title     = {{Efficient Algorithms for Logistic Contextual Slate Bandits with Bandit Feedback}},
  author    = {Goyal, Tanmay and Sinha, Gaurav},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  year      = {2025},
  pages     = {1533--1568},
  volume    = {286},
  url       = {https://mlanthology.org/uai/2025/goyal2025uai-efficient/}
}