Top-K Data Selection via Distributed Sample Quantile Inference

Abstract

We consider the problem of identifying the k largest measurements in a dataset distributed among a network of n agents that communicate over noisy links. We show that this problem can be cast as a distributed convex optimization problem, called sample quantile inference, which we solve using a two-time-scale stochastic approximation algorithm. We prove that the algorithm converges almost surely to an optimal solution. Moreover, the algorithm is robust to communication noise and, in our experiments, converges to the correct answer within a small number of iterations.
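To make the link between top-k selection and quantile inference concrete, the following is a minimal sketch, not the paper's algorithm: it is centralized and noise-free, whereas the paper's method is distributed and uses two time scales. It assumes distinct measurements and the quantile level p = (n - k) / n, which is one standard choice and not necessarily the paper's. It runs subgradient descent on the average pinball (quantile) loss, whose minimizers lie between the (n - k)-th and (n - k + 1)-th smallest values, and then keeps every measurement at or above the resulting threshold. The function names and the step-size rule `scale / t` are hypothetical.

```python
def pinball_subgradient(theta, data, k):
    """A subgradient of the average pinball loss at level p = (n - k) / n.

    Assumption: p = (n - k) / n is an illustrative choice of quantile level.
    """
    n = len(data)
    below = sum(1 for z in data if z < theta)
    # d/dtheta of (1/n) * sum_i rho_p(z_i - theta) equals (below / n) - p.
    return (below - (n - k)) / n


def estimate_threshold(data, k, iters=200, scale=None):
    """Centralized, noise-free subgradient descent with step size scale / t."""
    n = len(data)
    scale = n if scale is None else scale  # hypothetical default step scale
    theta = sum(data) / n  # arbitrary starting point: the sample mean
    for t in range(1, iters + 1):
        theta -= (scale / t) * pinball_subgradient(theta, data, k)
    return theta


def top_k(data, k, **kwargs):
    """Select the measurements at or above the estimated threshold."""
    theta = estimate_threshold(data, k, **kwargs)
    return sorted(z for z in data if z >= theta)


measurements = [3, 9, 1, 7, 5, 11, 2, 8]
print(top_k(measurements, 3))
```

For this data the threshold settles strictly between 7 and 8, so the selected set is the three largest values. Once the iterate lands in the gap between the (n - k)-th and (n - k + 1)-th smallest values, the subgradient is exactly zero and the threshold stops moving; with tied values or too few iterations the selection may not contain exactly k elements.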

Cite

Text

Zhang and Vasconcelos. "Top-K Data Selection via Distributed Sample Quantile Inference." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.

Markdown

[Zhang and Vasconcelos. "Top-K Data Selection via Distributed Sample Quantile Inference." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/zhang2023l4dc-topk/)

BibTeX

@inproceedings{zhang2023l4dc-topk,
  title     = {{Top-K Data Selection via Distributed Sample Quantile Inference}},
  author    = {Zhang, Xu and Vasconcelos, Marcos M.},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  year      = {2023},
  pages     = {813-824},
  volume    = {211},
  url       = {https://mlanthology.org/l4dc/2023/zhang2023l4dc-topk/}
}