Language Models as Hierarchy Encoders

He, Yuan; Yuan, Zhangdie; Chen, Jiaoyan; Horrocks, Ian

doi:10.52202/079017-0469

NeurIPS 2024

Abstract

Interpreting hierarchical structures latent in language is a key limitation of current language models (LMs). While previous research has implicitly leveraged these hierarchies to enhance LMs, approaches for their explicit encoding are yet to be explored. To address this, we introduce a novel approach to re-train transformer encoder-based LMs as Hierarchy Transformer encoders (HiTs), harnessing the expansive nature of hyperbolic space. Our method situates the output embedding space of pre-trained LMs within a Poincaré ball with a curvature that adapts to the embedding dimension, followed by re-training on hyperbolic clustering and centripetal losses. These losses are designed to effectively cluster related entities (input as texts) and organise them hierarchically. We evaluate HiTs against pre-trained LMs, standard fine-tuned LMs, and several hyperbolic embedding baselines, focusing on their capabilities in simulating transitive inference, predicting subsumptions, and transferring knowledge across hierarchies. The results demonstrate that HiTs consistently outperform all baselines in these tasks, underscoring the effectiveness and transferability of our re-trained hierarchy encoders.
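
The abstract names the ingredients (a Poincaré ball whose curvature depends on the embedding dimension, a clustering loss, and a centripetal loss) but does not give their formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a curvature magnitude `c = 1/d` for dimension `d`, a triplet-style margin form for the clustering loss, and a margin form for the centripetal loss that keeps a parent entity closer to the origin than its child. All function names, margins, and example vectors are hypothetical.

```python
import math


def poincare_distance(u, v, c):
    """Geodesic distance in a Poincare ball of curvature -c (radius 1/sqrt(c))."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    denom = (1.0 - c * sq_u) * (1.0 - c * sq_v)
    return math.acosh(1.0 + 2.0 * c * sq_diff / denom) / math.sqrt(c)


def hyperbolic_norm(v, c):
    """Hyperbolic distance from the origin of the ball to v."""
    return poincare_distance([0.0] * len(v), v, c)


def clustering_loss(child, parent, negative, c, margin=1.0):
    """Assumed triplet form: pull child towards parent, push it from a negative."""
    pos = poincare_distance(child, parent, c)
    neg = poincare_distance(child, negative, c)
    return max(pos - neg + margin, 0.0)


def centripetal_loss(child, parent, c, margin=0.1):
    """Assumed margin form: the parent should sit nearer the origin than the child."""
    return max(hyperbolic_norm(parent, c) - hyperbolic_norm(child, c) + margin, 0.0)


# Toy 2-dimensional embeddings (hypothetical values, all inside the ball).
dim = 2
c = 1.0 / dim                      # assumption: curvature magnitude 1/d
parent = [0.1, 0.0]
child = [0.5, 0.0]
negative = [-0.8, 0.6]

total = clustering_loss(child, parent, negative, c) + centripetal_loss(child, parent, c)
```

In this sketch the centripetal term is zero when the parent already lies sufficiently closer to the origin than the child, so only the clustering term contributes to `total`. In practice the embeddings would come from a transformer encoder and the losses would be computed with a differentiable tensor library rather than plain Python lists.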

Cite

Text

He et al. "Language Models as Hierarchy Encoders." Neural Information Processing Systems, 2024. doi:10.52202/079017-0469

Markdown

[He et al. "Language Models as Hierarchy Encoders." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/he2024neurips-language/) doi:10.52202/079017-0469

BibTeX

@inproceedings{he2024neurips-language,
  title     = {{Language Models as Hierarchy Encoders}},
  author    = {He, Yuan and Yuan, Zhangdie and Chen, Jiaoyan and Horrocks, Ian},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0469},
  url       = {https://mlanthology.org/neurips/2024/he2024neurips-language/}
}