Whittle Index with Multiple Actions and State Constraint for Inventory Management
Abstract
The Whittle index is a heuristic that performs well on restless bandit problems. In this paper, we extend the Whittle index to a new multi-agent reinforcement learning (MARL) setting with multiple discrete actions and a possibly changing constraint on the state space, resulting in WIMS (Whittle Index with Multiple actions and State constraint). This setting is common in inventory management, where each agent chooses a replenishment quantity for its corresponding stock-keeping unit (SKU) so that the total profit is maximized while the total inventory does not exceed a given limit. Accordingly, we propose a deep MARL algorithm based on WIMS for inventory management. Empirically, we evaluate our algorithm on real large-scale inventory management problems with up to 2307 SKUs, where it outperforms operations-research-based methods and baseline MARL algorithms.
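The setting described in the abstract can be read as a constrained optimization problem. The sketch below is an illustration, not the paper's exact formulation, and every symbol in it is an assumption introduced here: $N$ SKUs, inventory level $s^i_t$ and replenishment level $a^i_t$ for SKU $i$ at step $t$, per-SKU profit $r^i$, a discount factor $\gamma$, $K$ discrete replenishment levels, and a possibly time-varying inventory limit $C_t$.

```latex
% Illustrative formulation only; all symbols are assumptions, not taken from the paper.
\max_{\pi^1,\dots,\pi^N} \;
  \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} \sum_{i=1}^{N} r^{i}\!\left(s^{i}_{t}, a^{i}_{t}\right)\right]
\quad \text{s.t.} \quad
  \sum_{i=1}^{N} s^{i}_{t} \le C_{t} \;\; \forall t,
\qquad a^{i}_{t} \in \{0, 1, \dots, K\}.
```

The constraint couples otherwise independent per-SKU agents through the shared inventory limit, which is the kind of coupling that index-based heuristics such as the Whittle index are designed to handle.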
Cite
Text
Zhang et al. "Whittle Index with Multiple Actions and State Constraint for Inventory Management." International Conference on Learning Representations, 2024.
Markdown
[Zhang et al. "Whittle Index with Multiple Actions and State Constraint for Inventory Management." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/zhang2024iclr-whittle/)
BibTeX
@inproceedings{zhang2024iclr-whittle,
title = {{Whittle Index with Multiple Actions and State Constraint for Inventory Management}},
author = {Zhang, Chuheng and Wang, Xiangsen and Jiang, Wei and Yang, Xianliang and Wang, Siwei and Song, Lei and Bian, Jiang},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/zhang2024iclr-whittle/}
}