Top-K Data Selection via Distributed Sample Quantile Inference

L4DC 2023 pp. 813-824


Abstract

We consider the problem of determining the k largest (top-k) measurements in a dataset distributed among a network of n agents with noisy communication links. We show that this scenario can be cast as a distributed convex optimization problem called sample quantile inference, which we solve using a two-time-scale stochastic approximation algorithm. We prove that the algorithm converges almost surely to an optimal solution. Moreover, the algorithm tolerates the communication noise and, empirically, converges to the correct answer within a small number of iterations.
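The reduction from top-k selection to a convex quantile problem can be illustrated with a small centralized sketch. It is not the paper's distributed two-time-scale algorithm, and it models neither the network nor the noise. It only shows that the k-th largest value of a dataset minimizes a pinball (check) loss. The function names `pinball_loss` and `top_k_threshold`, the quantile level `p = (n - k + 0.5) / n`, and the sample data are assumptions made for illustration. The sketch also assumes the measurements are distinct.

```python
def pinball_loss(theta, data, p):
    """Sum of pinball (check) losses of the residuals z - theta at level p."""
    total = 0.0
    for z in data:
        u = z - theta
        # Residuals above theta are weighted by p, those below by (1 - p).
        total += u * p if u >= 0 else u * (p - 1.0)
    return total


def top_k_threshold(data, k):
    """Return the k-th largest value as a minimizer of the pinball loss.

    Assumption: the level p is chosen so that p * n is not an integer,
    which makes the minimizer unique when the values are distinct.
    """
    n = len(data)
    p = (n - k + 0.5) / n
    # The loss is convex and piecewise linear in theta, so its minimum is
    # attained at one of the data points; a scan over them is enough here.
    return min(data, key=lambda theta: pinball_loss(theta, data, p))


# Hypothetical measurements, pooled in one place for illustration only.
measurements = [3.2, 7.5, 1.1, 9.4, 4.8, 6.0]
threshold = top_k_threshold(measurements, k=2)
top_k = sorted(z for z in measurements if z >= threshold)
print(threshold, top_k)
```

Once a threshold of this kind is known, each agent could decide locally whether its own measurement belongs to the top-k by comparing against it, which is presumably why the problem is posed as quantile inference rather than as a full sort.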


Cite

Text

Zhang and Vasconcelos. "Top-K Data Selection via Distributed Sample Quantile Inference." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.

Markdown

[Zhang and Vasconcelos. "Top-K Data Selection via Distributed Sample Quantile Inference." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/zhang2023l4dc-topk/)

BibTeX

@inproceedings{zhang2023l4dc-topk,
  title     = {{Top-K Data Selection via Distributed Sample Quantile Inference}},
  author    = {Zhang, Xu and Vasconcelos, Marcos M.},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  year      = {2023},
  pages     = {813--824},
  volume    = {211},
  url       = {https://mlanthology.org/l4dc/2023/zhang2023l4dc-topk/}
}