Approximate Policy Iteration Using Large-Margin Classifiers

Abstract

We present an approximate policy iteration algorithm that uses rollouts to estimate the value of each action under a given policy in a subset of states and a classifier to generalize and learn the improved policy over the entire state space. Using a multiclass support vector machine as the classifier, we obtained successful results on the inverted pendulum and the bicycle balancing and riding domains.
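The scheme in the abstract can be sketched in a few lines: sample some states, estimate each action's value under the current policy via Monte Carlo rollouts, label each sampled state with its best action, and fit a classifier to those state–action pairs to get the improved policy. The sketch below is a minimal illustration on a hypothetical toy chain MDP (not one of the paper's domains), and it substitutes a trivial 1-nearest-neighbor rule for the multiclass SVM the paper uses; all names and parameters are illustrative assumptions.

```python
import random

# Hypothetical toy chain MDP: states 0..N-1, actions step left/right,
# reward 1 for reaching the rightmost state. Stand-in for the paper's domains.
N_STATES = 10
ACTIONS = (-1, +1)
GAMMA = 0.9

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def rollout(s, a, policy, horizon=20):
    """One truncated rollout estimating Q^pi(s, a): take action a, then follow policy."""
    s, r = step(s, a)
    total, disc = r, GAMMA
    for _ in range(horizon):
        s, r = step(s, policy(s))
        total += disc * r
        disc *= GAMMA
    return total

def nearest_neighbor_policy(examples):
    """Tiny 1-NN rule generalizing the sampled (state, best action) pairs.
    The paper uses a multiclass SVM here instead."""
    def policy(s):
        s0, a0 = min(examples, key=lambda e: abs(e[0] - s))
        return a0
    return policy

def api_iteration(n_iters=5, n_samples=8, n_rollouts=5):
    policy = lambda s: random.choice(ACTIONS)  # arbitrary initial policy
    for _ in range(n_iters):
        examples = []
        for s in random.sample(range(N_STATES), n_samples):
            # Average several rollouts per action to estimate Q-values at s.
            q = {a: sum(rollout(s, a, policy) for _ in range(n_rollouts)) / n_rollouts
                 for a in ACTIONS}
            examples.append((s, max(q, key=q.get)))  # label s with the greedy action
        policy = nearest_neighbor_policy(examples)   # generalize to all states
    return policy
```

Each iteration performs a policy-improvement step at the sampled states only; the classifier supplies the improved policy everywhere else, which is what lets the method avoid an explicit value-function representation over the full state space.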

Cite

Text

Lagoudakis and Parr. "Approximate Policy Iteration Using Large-Margin Classifiers." International Joint Conference on Artificial Intelligence, 2003.

Markdown

[Lagoudakis and Parr. "Approximate Policy Iteration Using Large-Margin Classifiers." International Joint Conference on Artificial Intelligence, 2003.](https://mlanthology.org/ijcai/2003/lagoudakis2003ijcai-approximate/)

BibTeX

@inproceedings{lagoudakis2003ijcai-approximate,
  title     = {{Approximate Policy Iteration Using Large-Margin Classifiers}},
  author    = {Lagoudakis, Michail G. and Parr, Ronald},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2003},
  pages     = {1432--1434},
  url       = {https://mlanthology.org/ijcai/2003/lagoudakis2003ijcai-approximate/}
}