Exploration of Tree-Based Hierarchical SoftMax for Recurrent Language Models

Abstract

Recently, variants of neural networks for computational linguistics have been proposed and successfully applied to neural language modeling and neural machine translation. These neural models can leverage knowledge from massive corpora, but they are extremely slow because they predict candidate words from a large vocabulary during both training and inference. As an alternative to gradient approximation and softmax with class decomposition, we explore the tree-based hierarchical softmax method and reformulate its architecture, making it compatible with modern GPUs and introducing a compact tree-based loss function. When combined with several word hierarchical clustering algorithms, improved performance is achieved on the language modelling task under intrinsic evaluation criteria on the PTB, WikiText-2, and WikiText-103 datasets.
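The core idea of tree-based hierarchical softmax can be illustrated with a minimal sketch (hypothetical names, not the paper's exact architecture): each word is a leaf of a binary tree, and its probability is a product of sigmoids at the internal nodes on the root-to-leaf path, so scoring one word costs O(log |V|) instead of O(|V|).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def word_path(word_id, vocab_size):
    """Return (internal_node, go_right) pairs from the root to the word's leaf.

    Heap-style indexing over a complete binary tree: internal nodes are
    1 .. vocab_size-1, leaves are vocab_size .. 2*vocab_size-1.
    """
    node = word_id + vocab_size  # leaf index in the implicit heap
    path = []
    while node > 1:
        parent = node // 2
        path.append((parent, node % 2))  # node % 2 == 1 means "right child"
        node = parent
    return list(reversed(path))

def hsm_log_prob(word_id, hidden, node_vecs, vocab_size):
    """log p(word | hidden): sum of log-sigmoid branch decisions on the path.

    node_vecs maps each internal node to its parameter vector; hidden is the
    RNN hidden state. sigmoid(score) is p(go right), and 1 - sigmoid(score)
    equals sigmoid(-score), so both branches share one dot product.
    """
    logp = 0.0
    for node, go_right in word_path(word_id, vocab_size):
        score = sum(a * b for a, b in zip(node_vecs[node], hidden))
        logp += math.log(sigmoid(score if go_right else -score))
    return logp
```

Because each internal node's left/right probabilities sum to one, the leaf probabilities over the whole vocabulary sum to one by construction; no normalization over |V| is ever computed.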

Cite

Text

Jiang et al. "Exploration of Tree-Based Hierarchical SoftMax for Recurrent Language Models." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/271

Markdown

[Jiang et al. "Exploration of Tree-Based Hierarchical SoftMax for Recurrent Language Models." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/jiang2017ijcai-exploration/) doi:10.24963/IJCAI.2017/271

BibTeX

@inproceedings{jiang2017ijcai-exploration,
  title     = {{Exploration of Tree-Based Hierarchical SoftMax for Recurrent Language Models}},
  author    = {Jiang, Nan and Rong, Wenge and Gao, Min and Shen, Yikang and Xiong, Zhang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {1951--1957},
  doi       = {10.24963/IJCAI.2017/271},
  url       = {https://mlanthology.org/ijcai/2017/jiang2017ijcai-exploration/}
}