Budgeted-Bandits with Controlled Restarts with Applications in Learning and Computing

Abstract

Maximizing the cumulative reward of a sequence of tasks under a time budget is a problem that arises in many applications in computing and machine learning. Often, tasks have random completion times, and the controller must learn the unknown statistics while making optimal decisions. In addition to the classic exploration-exploitation trade-off, it has been shown that a restarting strategy can boost the performance of the control algorithm by interrupting ongoing tasks at the expense of losing their rewards. In this work, we consider a bandit setting where each decision takes a random completion time and yields a random (possibly correlated) reward upon completion, both unknown at the time of the decision. The goal of the decision-maker is to maximize the expected total reward subject to a stringent time budget $\tau$. As an additional control, we allow the decision-maker to interrupt an ongoing task and forgo its reward in exchange for a potentially more rewarding restart. Unlike previous works, we neither assume any prior knowledge of the system statistics nor restrict the set of restart strategies to be finite. Under this general framework, we develop efficient bandit algorithms that find optimal arms and restart strategies, achieving $O(\log(\tau))$ regret for a finite set of restart times and $O(\sqrt{\tau\log(\tau)})$ regret for a continuous set of restart times. Furthermore, through numerical studies, we verify the applicability of our algorithms in the diverse contexts of: (i) algorithm portfolios for SAT solvers; (ii) task scheduling in wireless networks; and (iii) hyperparameter tuning in neural network training.
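To make the setting concrete, below is a minimal, hypothetical sketch (not the paper's algorithm) of a UCB-style budgeted bandit over (arm, restart-time) pairs. The reward-per-unit-time index, the `arm_sampler` interface, and the example arms are all illustrative assumptions: each pull consumes random time, the controller may cut the task off at a chosen restart time and forfeit its reward, and play stops once the budget $\tau$ is exhausted.

```python
import numpy as np

def truncated_pull(arm_sampler, cutoff, rng):
    """Run one task of the arm; interrupt it at `cutoff` if unfinished."""
    completion_time, reward = arm_sampler(rng)   # hypothetical sampler interface
    if completion_time <= cutoff:
        return completion_time, reward           # task finished: collect reward
    return cutoff, 0.0                           # restarted: time spent, no reward

def budgeted_ucb_with_restarts(arm_samplers, cutoffs, tau, seed=0):
    """Illustrative UCB over (arm, restart-time) pairs under a time budget tau."""
    rng = np.random.default_rng(seed)
    pairs = [(a, c) for a in range(len(arm_samplers)) for c in cutoffs]
    counts = np.zeros(len(pairs))
    time_sum = np.zeros(len(pairs))
    reward_sum = np.zeros(len(pairs))
    spent, total_reward, t = 0.0, 0.0, 0

    while spent < tau:
        t += 1
        # Optimistic reward-rate index (assumed form, not the paper's index).
        idx = np.empty(len(pairs))
        for i in range(len(pairs)):
            if counts[i] == 0:
                idx[i] = np.inf                  # force initial exploration
            else:
                rate = reward_sum[i] / max(time_sum[i], 1e-12)
                bonus = np.sqrt(2.0 * np.log(t) / counts[i])
                idx[i] = rate + bonus
        i = int(np.argmax(idx))
        arm, cutoff = pairs[i]

        elapsed, reward = truncated_pull(arm_samplers[arm], cutoff, rng)
        if spent + elapsed > tau:
            break                                # budget exhausted mid-task
        spent += elapsed
        total_reward += reward
        counts[i] += 1
        time_sum[i] += elapsed
        reward_sum[i] += reward
    return total_reward

# Example with two hypothetical arms (completion_time, reward):
arms = [
    lambda rng: (rng.exponential(2.0), 1.0),     # slow, always rewards 1
    lambda rng: (rng.exponential(0.5), 0.4),     # fast, smaller reward
]
print(budgeted_ucb_with_restarts(arms, cutoffs=[0.5, 1.0, 3.0, np.inf], tau=100.0))
```

The finite grid `cutoffs` corresponds to the finite-restart-time setting in the abstract; handling a continuous set of restart times, as the paper's $O(\sqrt{\tau\log(\tau)})$ result covers, would require a discretization or a different construction.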

Cite

Text

Cayci et al. "Budgeted-Bandits with Controlled Restarts with Applications in Learning and Computing." Transactions on Machine Learning Research, 2025.

Markdown

[Cayci et al. "Budgeted-Bandits with Controlled Restarts with Applications in Learning and Computing." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/cayci2025tmlr-budgetedbandits/)

BibTeX

@article{cayci2025tmlr-budgetedbandits,
  title     = {{Budgeted-Bandits with Controlled Restarts with Applications in Learning and Computing}},
  author    = {Cayci, Semih and Zheng, Yilin and Eryilmaz, Atilla},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/cayci2025tmlr-budgetedbandits/}
}