Predictive Q-Routing: A Memory-Based Reinforcement Learning Approach to Adaptive Traffic Control

Abstract

In this paper, we propose a memory-based Q-learning algorithm called predictive Q-routing (PQ-routing) for adaptive traffic control. We attempt to address two problems encountered in Q-routing (Boyan & Littman, 1994), namely, the inability to fine-tune routing policies under low network load and the inability to learn new optimal policies under decreasing load conditions. Unlike other memory-based reinforcement learning algorithms in which memory is used to keep past experiences to increase learning speed, PQ-routing keeps the best experiences learned and reuses them by predicting the traffic trend. The effectiveness of PQ-routing has been verified under various network topologies and traffic conditions. Simulation results show that PQ-routing is superior to Q-routing in terms of both learning speed and adaptability.
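
To make the abstract's idea concrete, below is a minimal Python sketch of a predictive routing table along the lines the abstract describes: besides the usual Q-value (estimated delivery time through a neighbor), each node remembers the best value it has ever learned for that neighbor and a recovery rate used to extrapolate the traffic trend when a link has gone unused. The class and attribute names, the alpha/beta/gamma parameters, and the exact update rules are illustrative assumptions, not the paper's precise formulation.

from collections import defaultdict


class PQRouterEntry:
    """Per-(destination, neighbor) statistics kept by a node (illustrative)."""

    def __init__(self):
        self.q = 0.0      # current estimated delivery time via this neighbor
        self.best = None  # lowest Q seen so far: the kept "best experience"
        self.rate = 0.0   # recovery rate (<= 0): trend of Q back toward best
        self.last = 0.0   # time of the last update, for extrapolation


class PQRouter:
    """One node's routing table with trend prediction (assumed sketch)."""

    def __init__(self, alpha=0.7, beta=0.9, gamma=0.9):
        self.alpha = alpha  # Q learning rate
        self.beta = beta    # recovery-rate learning rate (assumed)
        self.gamma = gamma  # recovery-rate decay under rising load (assumed)
        self.table = defaultdict(PQRouterEntry)

    def predicted_q(self, dest, neighbor, now):
        """Extrapolate Q along the recovery rate, floored at the best value."""
        e = self.table[(dest, neighbor)]
        if e.best is None:  # never updated yet: nothing to extrapolate
            return e.q
        dt = now - e.last
        return max(e.q + dt * e.rate, e.best)

    def choose(self, dest, neighbors, now):
        """Forward toward the neighbor with the smallest predicted time."""
        return min(neighbors, key=lambda y: self.predicted_q(dest, y, now))

    def update(self, dest, neighbor, observed, now):
        """Incorporate an observed delivery-time estimate for (dest, neighbor)."""
        e = self.table[(dest, neighbor)]
        dq = observed - e.q
        e.q += self.alpha * dq
        e.best = e.q if e.best is None else min(e.best, e.q)  # keep best experience
        dt = max(now - e.last, 1e-9)
        if dq < 0:               # load easing: learn how fast Q recovers
            e.rate += self.beta * (dq / dt)
        elif dq > 0:             # load rising: decay the (negative) rate
            e.rate *= self.gamma
        e.last = now


# Tiny usage example with made-up numbers:
router = PQRouter()
router.update(dest="D", neighbor="A", observed=5.0, now=1.0)
router.update(dest="D", neighbor="B", observed=2.0, now=1.0)
print(router.choose("D", ["A", "B"], now=2.0))  # -> "B"

The max(..., best) floor in predicted_q is what encodes "reusing the best experiences": a path whose Q-value was inflated by a past traffic burst is predicted to recover toward its best known value, so the router can re-select it once the load drops, which plain Q-routing under low or decreasing load fails to do.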

Cite

Text

Choi and Yeung. "Predictive Q-Routing: A Memory-Based Reinforcement Learning Approach to Adaptive Traffic Control." Neural Information Processing Systems, 1995.

Markdown

[Choi and Yeung. "Predictive Q-Routing: A Memory-Based Reinforcement Learning Approach to Adaptive Traffic Control." Neural Information Processing Systems, 1995.](https://mlanthology.org/neurips/1995/choi1995neurips-predictive/)

BibTeX

@inproceedings{choi1995neurips-predictive,
  title     = {{Predictive Q-Routing: A Memory-Based Reinforcement Learning Approach to Adaptive Traffic Control}},
  author    = {Choi, Samuel P. M. and Yeung, Dit-Yan},
  booktitle = {Neural Information Processing Systems},
  year      = {1995},
  pages     = {945--951},
  url       = {https://mlanthology.org/neurips/1995/choi1995neurips-predictive/}
}