Logistic Markov Decision Processes

Abstract

User modeling in advertising and recommendation has typically focused on myopic predictors of user responses. In this work, we consider the long-term decision problem associated with user interaction. We propose a concise specification of long-term interaction dynamics by combining factored dynamic Bayesian networks with logistic predictors of user responses, allowing state-of-the-art prediction models to be seamlessly extended. We show how to solve such models at scale by providing a constraint generation approach for approximate linear programming that overcomes the variable coupling and non-linearity induced by the logistic regression predictor. The efficacy of the approach is demonstrated on advertising domains with up to 2^54 states and 2^39 actions.

Cite

Text

Mladenov et al. "Logistic Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/346

Markdown

[Mladenov et al. "Logistic Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/mladenov2017ijcai-logistic/) doi:10.24963/IJCAI.2017/346

BibTeX

@inproceedings{mladenov2017ijcai-logistic,
  title     = {{Logistic Markov Decision Processes}},
  author    = {Mladenov, Martin and Boutilier, Craig and Schuurmans, Dale and Meshi, Ofer and Elidan, Gal and Lu, Tyler},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2486--2493},
  doi       = {10.24963/IJCAI.2017/346},
  url       = {https://mlanthology.org/ijcai/2017/mladenov2017ijcai-logistic/}
}