The Shortest Path Problem Under Partial Monitoring
Abstract
The on-line shortest path problem is considered under partial monitoring scenarios. At each round, a decision maker has to choose a path between two distinguished vertices of a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way such that the loss of the chosen path (defined as the sum of the weights of its composing edges) be small. In the multi-armed bandit setting, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this scenario, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to $1/\sqrt{n}$ and depends only polynomially on the number of edges of the graph. The algorithm can be implemented with linear complexity in the number of rounds n and in the number of edges. This result improves earlier bandit-algorithms which have performance bounds that either depend exponentially on the number of edges or converge to zero at a slower rate than $O(1/\sqrt{n})$ . An extension to the so-called label efficient setting is also given, where the decision maker is informed about the weight of the chosen path only with probability ε <1. Applications to routing in packet switched networks along with simulation results are also presented.
Cite
Text
György et al. "The Shortest Path Problem Under Partial Monitoring." Annual Conference on Computational Learning Theory, 2006. doi:10.1007/11776420_35Markdown
[György et al. "The Shortest Path Problem Under Partial Monitoring." Annual Conference on Computational Learning Theory, 2006.](https://mlanthology.org/colt/2006/gyorgy2006colt-shortest/) doi:10.1007/11776420_35BibTeX
@inproceedings{gyorgy2006colt-shortest,
title = {{The Shortest Path Problem Under Partial Monitoring}},
author = {György, András and Linder, Tamás and Ottucsák, György},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2006},
pages = {468-482},
doi = {10.1007/11776420_35},
url = {https://mlanthology.org/colt/2006/gyorgy2006colt-shortest/}
}