Successive Halving Top-K Operator

Abstract

We propose a differentiable successive halving method for relaxing the top-k operator, rendering gradient-based optimization possible. Using a tournament-style selection, it avoids performing softmax iteratively on the entire vector of scores. As a result, a much better approximation of top-k and a lower computational cost are achieved compared to the previous approach.
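The tournament idea can be illustrated with a minimal sketch: scores are paired up, each pair is reduced by a soft (differentiable) maximum, and the survivors are halved again until one value remains. This is an assumption-laden illustration of successive halving with a two-way softmax as the soft comparator, not the authors' exact operator; the function names and the temperature parameter `temp` are our own.

```python
import numpy as np

def soft_pairwise_max(a, b, temp=0.1):
    # Differentiable stand-in for max(a, b): a two-way softmax
    # weighting of the pair. As temp -> 0 this approaches the hard max.
    wa = np.exp(a / temp)
    wb = np.exp(b / temp)
    w = wa / (wa + wb)
    return w * a + (1 - w) * b

def soft_top1(scores, temp=0.1):
    # Tournament-style successive halving: keep a soft winner per pair
    # each round, halving the vector until a single value remains.
    # Total work is n + n/2 + ... = O(n), with no softmax over the
    # full score vector at any step.
    s = np.asarray(scores, dtype=float)
    while s.size > 1:
        if s.size % 2:          # odd length: carry the last element forward
            s = np.append(s, s[-1])
        s = soft_pairwise_max(s[0::2], s[1::2], temp)
    return s[0]
```

For example, `soft_top1([1.0, 5.0, 3.0, 2.0], temp=0.05)` returns a value close to 5.0; extending this single-winner tournament to a relaxed top-k (and keeping it numerically stable for large score magnitudes) is where the paper's actual operator comes in.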

Cite

Text

Pietruszka et al. "Successive Halving Top-K Operator." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I18.17931

Markdown

[Pietruszka et al. "Successive Halving Top-K Operator." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/pietruszka2021aaai-successive/) doi:10.1609/AAAI.V35I18.17931

BibTeX

@inproceedings{pietruszka2021aaai-successive,
  title     = {{Successive Halving Top-K Operator}},
  author    = {Pietruszka, Michal and Borchmann, Lukasz and Gralinski, Filip},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {15869--15870},
  doi       = {10.1609/AAAI.V35I18.17931},
  url       = {https://mlanthology.org/aaai/2021/pietruszka2021aaai-successive/}
}