Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach

Abstract

This paper describes the Q-routing algorithm for packet routing, in which a reinforcement learning module is embedded into each node of a switching network. Only local communication is used by each node to keep accurate statistics on which routing decisions lead to minimal delivery times. In simple experiments involving a 36-node, irregularly connected network, Q-routing proves superior to a nonadaptive algorithm based on precomputed shortest paths and is able to route efficiently even when critical aspects of the simulation, such as the network load, are allowed to vary dynamically. The paper concludes with a discussion of the tradeoff between discovering shortcuts and maintaining stable policies.
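The core idea can be sketched in a few lines. In Q-routing, each node x keeps an estimate Q_x(d, y) of the time to deliver a packet bound for destination d via neighbor y. When x forwards a packet to y, y replies with its own best remaining-time estimate, and x nudges its entry toward the observed one-hop cost plus that estimate. The sketch below is an illustrative reconstruction, not the authors' code; the class name, method names, and the optimistic zero initialization are assumptions made for this example.

```python
class QRouterNode:
    """One node's Q-routing table: Q[dest][neighbor] estimates delivery time.

    Illustrative sketch of the Q-routing update rule; names and the
    zero-initialized table are assumptions, not taken from the paper's code.
    """

    def __init__(self, node, neighbors, n_nodes, lr=0.5):
        self.node = node
        self.lr = lr  # learning rate for the estimate update
        # Optimistically initialize all delivery-time estimates to zero.
        self.Q = {d: {y: 0.0 for y in neighbors} for d in range(n_nodes)}

    def choose_neighbor(self, dest):
        # Greedy policy: forward via the neighbor with the lowest estimate.
        return min(self.Q[dest], key=self.Q[dest].get)

    def best_estimate(self, dest):
        # Reported back to an upstream node when it forwards a packet here.
        if dest == self.node:
            return 0.0
        return min(self.Q[dest].values())

    def update(self, dest, neighbor, queue_delay, trans_delay, neighbor_estimate):
        # Q-routing update: move the stored estimate toward the observed
        # one-hop cost plus the neighbor's best estimate for the remainder.
        old = self.Q[dest][neighbor]
        target = queue_delay + trans_delay + neighbor_estimate
        self.Q[dest][neighbor] = old + self.lr * (target - old)
```

Because each update uses only the delays a node observes locally and a single number returned by the downstream neighbor, no global view of the network is ever needed, which is what lets the policy track changing load.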

Cite

Text

Boyan and Littman. "Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach." Neural Information Processing Systems, 1993.

Markdown

[Boyan and Littman. "Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach." Neural Information Processing Systems, 1993.](https://mlanthology.org/neurips/1993/boyan1993neurips-packet/)

BibTeX

@inproceedings{boyan1993neurips-packet,
  title     = {{Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach}},
  author    = {Boyan, Justin A. and Littman, Michael L.},
  booktitle = {Neural Information Processing Systems},
  year      = {1993},
  pages     = {671-678},
  url       = {https://mlanthology.org/neurips/1993/boyan1993neurips-packet/}
}