Logistic Markov Decision Processes
Abstract
User modeling in advertising and recommendation has typically focused on myopic predictors of user responses. In this work, we consider the long-term decision problem associated with user interaction. We propose a concise specification of long-term interaction dynamics by combining factored dynamic Bayesian networks with logistic predictors of user responses, allowing state-of-the-art prediction models to be seamlessly extended. We show how to solve such models at scale by providing a constraint generation approach for approximate linear programming that overcomes the variable coupling and non-linearity induced by the logistic regression predictor. The efficacy of the approach is demonstrated on advertising domains with up to 2^54 states and 2^39 actions.
Cite
Text
Mladenov et al. "Logistic Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/346
Markdown
[Mladenov et al. "Logistic Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/mladenov2017ijcai-logistic/) doi:10.24963/IJCAI.2017/346
BibTeX
@inproceedings{mladenov2017ijcai-logistic,
title = {{Logistic Markov Decision Processes}},
author = {Mladenov, Martin and Boutilier, Craig and Schuurmans, Dale and Meshi, Ofer and Elidan, Gal and Lu, Tyler},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2017},
pages = {2486-2493},
doi = {10.24963/IJCAI.2017/346},
url = {https://mlanthology.org/ijcai/2017/mladenov2017ijcai-logistic/}
}