Hierarchical Credit Allocation in a Classifier System

Abstract

Learning systems which engage in sequential activity face the problem of properly allocating credit to steps or actions which make possible later steps that result in environmental payoff. In the classifier systems studied by Holland and others, credit is allocated by means of a "bucket-brigade" algorithm through which, over time, environmental payoff in effect flows back to classifiers which take early, stage-setting actions. The algorithm has advantages of simplicity and locality, but may not adequately reinforce long action sequences. We suggest an alternative form for the algorithm and the system's operating principles designed to induce behavioral hierarchies in which modularity of the hierarchy would keep all bucket-brigade chains short, thus more reinforceable and more rapidly learned, but overall action sequences could be long.
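
As a rough illustration of why long chains are slow to reinforce, the following Python sketch simulates a heavily simplified bucket brigade on a single fixed chain of classifiers. It is not the algorithm proposed in the paper. The function names (`run_chain`, `episodes_until`), the bid fraction, the payoff value, and the fixed firing order are all assumptions made for the example.

```python
def run_chain(n_steps, episodes, bid_fraction=0.1, payoff=100.0):
    """Simplified bucket brigade on one fixed chain (illustrative assumption).

    Classifier i fires at step i of every episode. When it fires it pays a
    bid (a fixed fraction of its strength) to its predecessor; the first
    classifier's bid has no recipient and is simply deducted. The last
    classifier receives the environmental payoff at the end of the episode.
    """
    strength = [0.0] * n_steps
    for _ in range(episodes):
        for i in range(n_steps):
            bid = bid_fraction * strength[i]
            strength[i] -= bid
            if i > 0:
                strength[i - 1] += bid  # payment flows one link backward
        strength[-1] += payoff
    return strength


def episodes_until(n_steps, fraction=0.5, bid_fraction=0.1, payoff=100.0,
                   max_episodes=10_000):
    """Episodes needed for the FIRST classifier to reach `fraction` of its
    steady-state strength (payoff / bid_fraction in this simplified model)."""
    target = fraction * payoff / bid_fraction
    strength = [0.0] * n_steps
    for episode in range(1, max_episodes + 1):
        for i in range(n_steps):
            bid = bid_fraction * strength[i]
            strength[i] -= bid
            if i > 0:
                strength[i - 1] += bid
        strength[-1] += payoff
        if strength[0] >= target:
            return episode
    return None  # target not reached within max_episodes


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, "steps:", episodes_until(n), "episodes")
```

In this toy model, payoff moves back only one link per episode, so the stage-setting classifier at the head of a chain of `n_steps` classifiers receives nothing for the first `n_steps - 1` episodes. Shorter chains strengthen their earliest members sooner, which is the property the abstract attributes to a modular hierarchy.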

Cite

Text

Wilson. "Hierarchical Credit Allocation in a Classifier System." International Joint Conference on Artificial Intelligence, 1987.

Markdown

[Wilson. "Hierarchical Credit Allocation in a Classifier System." International Joint Conference on Artificial Intelligence, 1987.](https://mlanthology.org/ijcai/1987/wilson1987ijcai-hierarchical/)

BibTeX

@inproceedings{wilson1987ijcai-hierarchical,
  title     = {{Hierarchical Credit Allocation in a Classifier System}},
  author    = {Wilson, Stewart W.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1987},
  pages     = {217-220},
  url       = {https://mlanthology.org/ijcai/1987/wilson1987ijcai-hierarchical/}
}