Balancing Multiple Sources of Reward in Reinforcement Learning
Abstract
For many problems which would be natural for reinforcement learning, the reward signal is not a single scalar value but has multiple scalar components. Examples of such problems include agents with multiple goals and agents with multiple users. Creating a single reward value by combining the multiple components can throw away vital information and can lead to incorrect solutions. We describe the multiple reward source problem and discuss the problems with applying traditional reinforcement learning. We then present a new algorithm for finding a solution and results on simulated environments.
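The abstract's claim that collapsing multiple reward components into one scalar can discard vital information can be illustrated with a minimal sketch (not from the paper; the two-user setup and values here are hypothetical): two outcomes that treat two users very differently can produce identical scalar rewards once summed, so a scalarizing agent cannot distinguish them.

```python
# Hypothetical two-user reward example (illustrative values, not from the paper).
# Each outcome yields a reward component per user: (reward to user A, reward to user B).
outcome_fair = (0.5, 0.5)    # both users get moderate reward
outcome_unfair = (1.0, 0.0)  # user B gets nothing

# Scalarizing by summing the components erases the distinction:
scalar_fair = sum(outcome_fair)
scalar_unfair = sum(outcome_unfair)
print(scalar_fair == scalar_unfair)  # True: the per-user structure is lost
```

Any fixed weighted sum has the same issue for some pair of outcomes, which is why the paper treats the components as separate reward sources rather than pre-combining them.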
Cite
Text
Shelton. "Balancing Multiple Sources of Reward in Reinforcement Learning." Neural Information Processing Systems, 2000.

Markdown

[Shelton. "Balancing Multiple Sources of Reward in Reinforcement Learning." Neural Information Processing Systems, 2000.](https://mlanthology.org/neurips/2000/shelton2000neurips-balancing/)

BibTeX
@inproceedings{shelton2000neurips-balancing,
title = {{Balancing Multiple Sources of Reward in Reinforcement Learning}},
author = {Shelton, Christian R.},
booktitle = {Neural Information Processing Systems},
year = {2000},
pages = {1082--1088},
url = {https://mlanthology.org/neurips/2000/shelton2000neurips-balancing/}
}