Successive Halving Top-K Operator
Abstract
We propose a differentiable successive-halving relaxation of the top-k operator, making gradient-based optimization possible. A tournament-style selection avoids iteratively applying softmax to the entire vector of scores. As a result, we achieve a much closer approximation of top-k at a lower computational cost than the previous approach.
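As a rough illustration of the core idea (not the authors' exact formulation), the sketch below relaxes a top-1 tournament: at each round, entries are paired and a two-way softmax replaces the hard argmax, so the "winner" of each pair is a differentiable convex combination. Repeating the round halves the vector until one entry remains. The function names `soft_halve` and `soft_top1` and the `temperature` parameter are illustrative assumptions; the input length is assumed to be a power of two.

```python
import numpy as np

def soft_halve(scores, values, temperature=1.0):
    """One tournament round: pair adjacent entries and softly keep the
    winner of each pair via a 2-way softmax (hypothetical helper)."""
    s = scores.reshape(-1, 2)                  # pair up the scores
    v = values.reshape(-1, 2)                  # pair up the carried values
    w = np.exp(s / temperature)
    w = w / w.sum(axis=1, keepdims=True)       # softmax within each pair
    # Each pair collapses to a convex combination weighted by w,
    # which is differentiable in the original scores.
    return (w * s).sum(axis=1), (w * v).sum(axis=1)

def soft_top1(scores, temperature=0.1):
    """Successive halving: repeat pairwise soft rounds until a single
    entry remains. Low temperature approaches the hard max; high
    temperature approaches the mean. Length must be a power of two."""
    s = scores.astype(float)
    v = scores.astype(float)
    while s.size > 1:
        s, v = soft_halve(s, v, temperature)
    return v[0]
```

Because only pairwise softmaxes are evaluated, each round touches every element once, giving the logarithmic-depth, tournament-style structure the abstract describes; a top-k variant would run such tournaments over k sections of the input.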
Cite
Text
Pietruszka et al. "Successive Halving Top-K Operator." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I18.17931
Markdown
[Pietruszka et al. "Successive Halving Top-K Operator." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/pietruszka2021aaai-successive/) doi:10.1609/AAAI.V35I18.17931
BibTeX
@inproceedings{pietruszka2021aaai-successive,
title = {{Successive Halving Top-K Operator}},
author = {Pietruszka, Michal and Borchmann, Lukasz and Gralinski, Filip},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {15869-15870},
doi = {10.1609/AAAI.V35I18.17931},
url = {https://mlanthology.org/aaai/2021/pietruszka2021aaai-successive/}
}