A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling

Abstract

Automatic Guided Vehicles (AGVs) are increasingly being used in manufacturing plants for transportation tasks. Optimal scheduling of AGVs is a difficult problem. A learning AGV is very attractive in a manufacturing plant since it is hard to manually optimize the scheduling algorithm for each new situation. In this paper we compare four reinforcement learning methods for scheduling AGVs. Q-learning [Watkins and Dayan 92] and R-learning [Schwartz 93] do not use action models. Q-learning optimizes the discounted total reward, while R-learning optimizes the average undiscounted reward per step. ARTDP [Barto et al. to appear] is a discounted method that uses action models. H-learning [Tadepalli and Ok 94] is an undiscounted version of ARTDP based on an algorithm of Jalali and Ferguson [Jalali and Ferguson 89].
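
The contrast between the two model-free methods named in the abstract can be sketched with their commonly cited tabular update rules. This is a minimal illustration and not the paper's implementation: the function names, the dictionary-based tables, and the step sizes `alpha`, `beta`, and discount `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step (discounted total reward).

    Moves Q(s, a) toward r + gamma * max_b Q(s_next, b).
    """
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def r_learning_update(R, rho, s, a, r, s_next, actions, beta=0.1, alpha=0.01):
    """One R-learning step (average undiscounted reward per step).

    R(s, a) holds a value relative to the average reward estimate rho.
    rho is adjusted only when the action taken was greedy in s.
    Returns the new rho (assumed textbook form of the rule).
    """
    best_here = max(R[(s, b)] for b in actions)
    best_next = max(R[(s_next, b)] for b in actions)
    was_greedy = R[(s, a)] == best_here

    # No discount factor: the reward is measured against rho instead.
    R[(s, a)] += beta * (r - rho + best_next - R[(s, a)])

    if was_greedy:
        rho += alpha * (r - rho + best_next - best_here)
    return rho


# Hypothetical single transition: state 0, action "load", reward 1.0, next state 1.
actions = ["load", "move"]
Q = defaultdict(float)
R = defaultdict(float)
q_learning_update(Q, 0, "load", 1.0, 1, actions, alpha=0.5, gamma=0.9)
rho = r_learning_update(R, 0.0, 0, "load", 1.0, 1, actions, beta=0.5, alpha=0.1)
```

The only structural difference is the target: Q-learning shrinks future value by `gamma`, while R-learning subtracts a running estimate of the average reward and keeps future value undiscounted.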

Cite

Text

Ok. "A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling." AAAI Conference on Artificial Intelligence, 1994.

Markdown

[Ok. "A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling." AAAI Conference on Artificial Intelligence, 1994.](https://mlanthology.org/aaai/1994/ok1994aaai-comparison/)

BibTeX

@inproceedings{ok1994aaai-comparison,
  title     = {{A Comparison of Reinforcement Learning Methods for Automatic Guided Vehicle Scheduling}},
  author    = {Ok, DoKyeong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1994},
  pages     = {1482},
  url       = {https://mlanthology.org/aaai/1994/ok1994aaai-comparison/}
}