Learning to Control Dynamic Systems with Automatic Quantization
Abstract
Reinforcement learning is often used to learn to control dynamic systems, which are described by quantitative state variables. Most previous work that learns qualitative (symbolic) control rules cannot construct the symbols themselves. That is, a correct partition of the state variables, or a correct set of qualitative symbols, is given to the learning program. We do not make this assumption in our work on learning to control dynamic systems. The learning task is divided into two phases. The first phase extracts symbols from quantitative inputs; this process is commonly called quantization. The second phase evaluates the symbols obtained in the first phase and induces the best possible symbolic rules based on those symbols. These two phases interact with each other and thus make the whole learning task very difficult. We demonstrate that our new method, called STAQ (Set Training with Automatic Quantization), can aggressively partition the input variables to a finer resolution until correct control rules based on these partitions (symbols) are learned. In particular, we use STAQ to solve the well-known cart-pole balancing problem.
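The abstract's first phase, quantization, amounts to splitting the range of a state variable into cells fine enough that a consistent symbolic rule can be attached to each. The sketch below is not the authors' STAQ algorithm; it only illustrates the general idea of refining a partition until each cell supports a single action. The function name `refine_partition` and the toy data are our own inventions for illustration.

```python
# Illustrative sketch (NOT the STAQ algorithm itself): adaptively refine a
# partition of one state variable until each cell supports a consistent action.

def refine_partition(samples, lo, hi, max_depth=6):
    """Recursively split [lo, hi) while it contains conflicting actions.

    `samples` is a list of (state_value, action) pairs; returns a list of
    (lo, hi, action) cells, each labelled with its majority action.
    """
    inside = [(x, a) for x, a in samples if lo <= x < hi]
    actions = {a for _, a in inside}
    if len(actions) <= 1 or max_depth == 0:
        # Cell is "pure" (or the splitting budget is exhausted):
        # emit one symbolic rule covering this interval.
        majority = (max(actions, key=lambda a: sum(1 for _, b in inside if b == a))
                    if inside else None)
        return [(lo, hi, majority)]
    mid = (lo + hi) / 2.0
    return (refine_partition(inside, lo, mid, max_depth - 1)
            + refine_partition(inside, mid, hi, max_depth - 1))

# Toy cart-pole-like data: push right when the pole angle is positive, else left.
data = [(-0.3, "left"), (-0.1, "left"), (0.05, "right"), (0.2, "right")]
cells = refine_partition(data, -0.5, 0.5)
for lo, hi, action in cells:
    print(f"angle in [{lo:+.2f}, {hi:+.2f}) -> {action}")
```

On this toy data a single split at angle 0 already yields pure cells, so the recursion stops early; with noisier data the partition would deepen only where actions conflict, which mirrors the paper's idea of refining resolution only until correct rules emerge.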
Cite
Text
Ling and Buchal. "Learning to Control Dynamic Systems with Automatic Quantization." European Conference on Machine Learning, 1993. doi:10.1007/3-540-56602-3_153
Markdown
[Ling and Buchal. "Learning to Control Dynamic Systems with Automatic Quantization." European Conference on Machine Learning, 1993.](https://mlanthology.org/ecmlpkdd/1993/ling1993ecml-learning/) doi:10.1007/3-540-56602-3_153
BibTeX
@inproceedings{ling1993ecml-learning,
title = {{Learning to Control Dynamic Systems with Automatic Quantization}},
author = {Ling, Charles X. and Buchal, Ralph},
booktitle = {European Conference on Machine Learning},
year = {1993},
pages = {372--377},
doi = {10.1007/3-540-56602-3_153},
url = {https://mlanthology.org/ecmlpkdd/1993/ling1993ecml-learning/}
}