A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling
Abstract
Automatic Guided Vehicles (AGVs) are increasingly being used in manufacturing plants for transportation tasks. Optimal scheduling of AGVs is a difficult problem. A learning AGV is very attractive in a manufacturing plant, since it is hard to manually optimize the scheduling algorithm for each new situation. In this paper we compare four reinforcement learning methods for scheduling AGVs. Q-learning [Watkins and Dayan 92] and R-learning [Schwartz 93] do not use action models. Q-learning optimizes the discounted total reward, while R-learning optimizes the average undiscounted reward per step. ARTDP [Barto et al. to appear] is a discounted method that uses action models. H-learning [Tadepalli and Ok 94] is an undiscounted version of ARTDP based on an algorithm of Jalali and Ferguson [Jalali and Ferguson 89].
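To make the contrast between the discounted and average-reward criteria concrete, here is a minimal tabular sketch of the two model-free updates in their common textbook form. It is not taken from the paper: the function names, the dictionary-based tables, and the step-size and discount values are illustrative assumptions, and the paper's exact formulations may differ. ARTDP and H-learning additionally learn action models and are not sketched here.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (discounted total reward).

    Q is a dict keyed by (state, action); alpha and gamma are assumed values.
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def r_learning_update(R, rho, s, a, r, s_next, actions, beta=0.1, alpha=0.01):
    """One tabular R-learning step (average undiscounted reward per step).

    R holds relative action values; rho is the running estimate of the
    average reward. Returns the updated rho. beta and alpha are assumed values.
    """
    best_next = max(R[(s_next, b)] for b in actions)
    # No discount factor: the reward is measured relative to the average rho.
    R[(s, a)] += beta * (r - rho + best_next - R[(s, a)])

    # Re-read the maxima after the value update.
    best_here = max(R[(s, b)] for b in actions)
    best_next = max(R[(s_next, b)] for b in actions)
    # Update rho only when the action taken is greedy in s.
    if R[(s, a)] == best_here:
        rho += alpha * (r + best_next - best_here - rho)
    return rho


if __name__ == "__main__":
    actions = ["load", "move"]  # hypothetical AGV actions
    Q = defaultdict(float)
    R = defaultdict(float)
    q_learning_update(Q, 0, "move", 1.0, 1, actions)
    rho = r_learning_update(R, 0.0, 0, "move", 1.0, 1, actions)
    print(Q[(0, "move")], R[(0, "move")], rho)
```

The only structural difference is that the Q-learning target discounts the next-state value by `gamma`, whereas the R-learning target subtracts the average-reward estimate `rho` and uses no discount.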
Cite
Text
Ok. "A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling." AAAI Conference on Artificial Intelligence, 1994.
Markdown
[Ok. "A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling." AAAI Conference on Artificial Intelligence, 1994.](https://mlanthology.org/aaai/1994/ok1994aaai-comparison/)
BibTeX
@inproceedings{ok1994aaai-comparison,
title = {{A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling}},
author = {Ok, DoKyeong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1994},
pages = {1482},
url = {https://mlanthology.org/aaai/1994/ok1994aaai-comparison/}
}