Accelerating Large-Scale Inference with Anisotropic Vector Quantization

Abstract

Quantization-based techniques are the current state of the art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on the observation that, for a given query, the database points with the largest inner products are the most relevant, we develop a family of anisotropic quantization loss functions. Under natural statistical assumptions, we show that quantization with these loss functions leads to a new variant of vector quantization that penalizes the parallel component of a datapoint's residual more heavily than its orthogonal component. The proposed approach, whose implementation is open source, achieves state-of-the-art results on the public benchmarks available at ann-benchmarks.com.
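To make the loss described above concrete, the following minimal NumPy sketch decomposes a datapoint's quantization residual into components parallel and orthogonal to the datapoint and weights them separately. The function name and the weights `h_parallel` and `h_orthogonal` are illustrative assumptions, not the paper's API, and deriving the weights from the paper's statistical assumptions is beyond this sketch.

```python
import numpy as np

def anisotropic_loss(x, x_quantized, h_parallel=2.0, h_orthogonal=1.0):
    """Hypothetical sketch of an anisotropic quantization loss.

    The residual r = x - x_quantized is split into the component parallel
    to the datapoint x and the component orthogonal to it; each part is
    penalized by its own weight. Choosing h_parallel > h_orthogonal gives
    the anisotropic behavior described in the abstract.
    """
    r = x - x_quantized
    x_norm_sq = np.dot(x, x)
    # Project the residual onto the direction of x.
    r_parallel = (np.dot(r, x) / x_norm_sq) * x
    r_orthogonal = r - r_parallel
    return (h_parallel * np.dot(r_parallel, r_parallel)
            + h_orthogonal * np.dot(r_orthogonal, r_orthogonal))
```

With `h_parallel == h_orthogonal` the two terms sum (by the Pythagorean theorem) to the ordinary squared reconstruction error, which makes the connection to traditional quantization explicit.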

Cite

Text

Guo et al. "Accelerating Large-Scale Inference with Anisotropic Vector Quantization." International Conference on Machine Learning, 2020.

Markdown

[Guo et al. "Accelerating Large-Scale Inference with Anisotropic Vector Quantization." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/guo2020icml-accelerating/)

BibTeX

@inproceedings{guo2020icml-accelerating,
  title     = {{Accelerating Large-Scale Inference with Anisotropic Vector Quantization}},
  author    = {Guo, Ruiqi and Sun, Philip and Lindgren, Erik and Geng, Quan and Simcha, David and Chern, Felix and Kumar, Sanjiv},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {3887--3896},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/guo2020icml-accelerating/}
}