Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach
Abstract
This paper describes the Q-routing algorithm for packet routing, in which a reinforcement learning module is embedded into each node of a switching network. Only local communication is used by each node to keep accurate statistics on which routing decisions lead to minimal delivery times. In simple experiments involving a 36-node, irregularly connected network, Q-routing proves superior to a nonadaptive algorithm based on precomputed shortest paths and is able to route efficiently even when critical aspects of the simulation, such as the network load, are allowed to vary dynamically. The paper concludes with a discussion of the tradeoff between discovering shortcuts and maintaining stable policies.
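The abstract does not spell out the per-node update, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited form of the Q-routing rule, in which a node that sends a packet bound for destination d to neighbor y moves its delivery-time estimate toward q + s + t, where q is the time the packet spent in the node's queue, s is the transmission time, and t is the neighbor's own best estimate for d. The class name `QRoutingNode`, its method names, and the learning rate of 0.5 are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import defaultdict


class QRoutingNode:
    """Local delivery-time estimates kept by one node (hypothetical helper)."""

    def __init__(self, name, neighbors, learning_rate=0.5):
        self.name = name
        self.neighbors = list(neighbors)
        self.learning_rate = learning_rate  # assumed value, not from the paper
        # q[dest][neighbor] = estimated time to deliver to dest via neighbor.
        # Estimates start at 0.0 for every neighbor.
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Smallest estimated remaining delivery time to dest from this node."""
        if dest == self.name:
            return 0.0
        return min(self.q[dest].values())

    def choose_next_hop(self, dest):
        """Greedy choice: the neighbor with the lowest current estimate."""
        table = self.q[dest]
        return min(table, key=table.get)

    def update(self, dest, neighbor, queue_time, transmit_time, neighbor_estimate):
        """Move the estimate for (dest, neighbor) toward the observed target.

        neighbor_estimate is the value the neighbor reports back, i.e. its own
        best_estimate(dest); this is the only communication the sketch needs.
        """
        old = self.q[dest][neighbor]
        target = queue_time + transmit_time + neighbor_estimate
        self.q[dest][neighbor] = old + self.learning_rate * (target - old)
        return self.q[dest][neighbor]


# Usage: node "a" forwards a packet for "d" to "b" and learns from b's reply.
node = QRoutingNode("a", ["b", "c"])
node.update("d", "b", queue_time=2.0, transmit_time=1.0, neighbor_estimate=3.0)
next_hop = node.choose_next_hop("d")
```

Because each update uses only the neighbor's reported estimate, the sketch reflects the "only local communication" property described above. The purely greedy `choose_next_hop` is a simplification; it does nothing to address the tradeoff between discovering shortcuts and keeping policies stable that the abstract mentions.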
Cite
Text
Boyan and Littman. "Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach." Neural Information Processing Systems, 1993.
Markdown
[Boyan and Littman. "Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach." Neural Information Processing Systems, 1993.](https://mlanthology.org/neurips/1993/boyan1993neurips-packet/)
BibTeX
@inproceedings{boyan1993neurips-packet,
title = {{Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach}},
author = {Boyan, Justin A. and Littman, Michael L.},
booktitle = {Neural Information Processing Systems},
year = {1993},
pages = {671-678},
url = {https://mlanthology.org/neurips/1993/boyan1993neurips-packet/}
}