Budget Allocation Using Weakly Coupled, Constrained Markov Decision Processes

Abstract

We consider the problem of budget (or other resource) allocation in sequential decision problems involving a large number of concurrently running sub-processes, whose only interaction is through their consumption of budget. Our first contribution is the introduction of budgeted MDPs (BMDPs), an MDP model in which policies/values are a function of available budget, (c.f. constrained MDPs which are solved given a fixed budget). BMDPs allow one to explicitly trade off allocated budget and expected value. We show that optimal value functions are concave, non-decreasing in budget, and piecewise-linear in the finite horizon case, and can be computed by dynamic programming (and support ready approximation). Our second contribution is a method that exploits BMDP solutions to allocate budget to a large number of independent BMDPs, coupled only by their common budget pool. The problem can be cast as a multiple-choice knapsack problem, which admits an efficient, optimal greedy algorithm. Empirical results in an online advertising domain confirm the efficacy of our methods.

Cite

Text

Boutilier and Lu. "Budget Allocation Using Weakly Coupled, Constrained Markov Decision Processes." Conference on Uncertainty in Artificial Intelligence, 2016.

Markdown

[Boutilier and Lu. "Budget Allocation Using Weakly Coupled, Constrained Markov Decision Processes." Conference on Uncertainty in Artificial Intelligence, 2016.](https://mlanthology.org/uai/2016/boutilier2016uai-budget/)

BibTeX

@inproceedings{boutilier2016uai-budget,
  title     = {{Budget Allocation Using Weakly Coupled, Constrained Markov Decision Processes}},
  author    = {Boutilier, Craig and Lu, Tyler},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2016},
  url       = {https://mlanthology.org/uai/2016/boutilier2016uai-budget/}
}