Hyperbolic Attention Networks
Abstract
Recent approaches have successfully demonstrated the benefits of learning the parameters of shallow networks in hyperbolic space. We extend this line of work by imposing hyperbolic geometry on the embeddings used to compute the ubiquitous attention mechanisms for different neural network architectures. By only changing the geometry of the embeddings of object representations, we can use the embedding space more efficiently without increasing the number of parameters of the model. Mainly, since the number of objects grows exponentially with semantic distance from the query, hyperbolic geometry, as opposed to Euclidean geometry, can encode those objects without interference. Our method shows improvements in generalization on neural machine translation on WMT'14 (English to German), learning on graphs (both synthetic and real-world graph tasks), and visual question answering (CLEVR), while keeping the neural representations compact.
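The core idea can be illustrated with attention weights derived from hyperbolic distances rather than dot products. The following is a minimal sketch using the Poincaré-ball distance; the paper itself works in the hyperboloid model and aggregates values with the Einstein midpoint, and the scale `beta` and offset `c` below are assumed to be learnable in the full model, so this is illustrative only.

```python
# Minimal sketch of distance-based attention in hyperbolic space (Poincare-ball
# distance). Not the paper's exact formulation: the paper uses the hyperboloid
# model and Einstein-midpoint aggregation of the values.
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points u, v inside the unit Poincare ball."""
    sq_norm_u = np.sum(u * u, axis=-1)
    sq_norm_v = np.sum(v * v, axis=-1)
    sq_diff = np.sum((u - v) ** 2, axis=-1)
    x = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v) + eps)
    return np.arccosh(np.maximum(x, 1.0 + eps))

def hyperbolic_attention_weights(queries, keys, beta=1.0, c=0.0):
    """Attention weights from negative hyperbolic distances instead of dot products.

    queries: (n_q, d) and keys: (n_k, d) are points inside the ball.
    beta and c would be learnable scale/offset parameters in a full model.
    """
    # Pairwise query-key geodesic distances via broadcasting.
    d = poincare_distance(queries[:, None, :], keys[None, :, :])
    scores = -beta * d - c                        # closer keys score higher
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy usage: 2 queries attending over 4 keys in a 3-dimensional ball.
rng = np.random.default_rng(0)
q = 0.3 * rng.standard_normal((2, 3))
k = 0.3 * rng.standard_normal((4, 3))
print(hyperbolic_attention_weights(q, k))
```

Because volume in hyperbolic space grows exponentially with radius, points that are semantically far from a query can be placed at large distances without crowding, which is what the distance-based scores above exploit.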
Cite
Text
Gulcehre et al. "Hyperbolic Attention Networks." International Conference on Learning Representations, 2019.

Markdown

[Gulcehre et al. "Hyperbolic Attention Networks." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/gulcehre2019iclr-hyperbolic/)

BibTeX
@inproceedings{gulcehre2019iclr-hyperbolic,
title = {{Hyperbolic Attention Networks}},
author = {Gulcehre, Caglar and Denil, Misha and Malinowski, Mateusz and Razavi, Ali and Pascanu, Razvan and Hermann, Karl Moritz and Battaglia, Peter and Bapst, Victor and Raposo, David and Santoro, Adam and de Freitas, Nando},
booktitle = {International Conference on Learning Representations},
year = {2019},
url = {https://mlanthology.org/iclr/2019/gulcehre2019iclr-hyperbolic/}
}