Enhancing LLM Complex Reasoning Capability Through Hyperbolic Geometry
Abstract
In the era of foundation models and large language models (LLMs), Euclidean space is the de facto geometric setting. However, recent studies highlight that this choice comes with limitations. We investigate the non-Euclidean characteristics of LLMs on complex reasoning tasks, finding that token embeddings and hidden states exhibit a significant degree of hyperbolicity, indicating an underlying hyperbolic structure. To exploit this hyperbolicity, we propose Hyperbolic Low-Rank Adaptation (HoRA), which performs low-rank adaptation fine-tuning of LLMs in hyperbolic space. HoRA operates directly on the hyperbolic manifold, avoiding the issues that arise from exponential and logarithmic maps when embedding and weight matrices reside in Euclidean space. Experiments show that HoRA substantially improves LLM performance on complex reasoning tasks, with gains of up to 17.30% over Euclidean LoRA on the hard-level AQuA dataset.
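The hyperbolicity mentioned in the abstract is typically quantified with Gromov's δ-hyperbolicity (the four-point condition on pairwise distances): the smaller the relative δ, the more tree-like, i.e. hyperbolic, the point set. The abstract does not spell out the authors' exact measurement procedure, so the snippet below is only a minimal sketch of one common way to estimate δ from a sample of embeddings; the use of Euclidean pairwise distances, the base point choice, and the normalization by the diameter are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' code) of estimating Gromov
# delta-hyperbolicity for a sample of embedding vectors.
import numpy as np

def relative_gromov_delta(points: np.ndarray) -> float:
    """Estimate the relative delta-hyperbolicity of a point cloud.

    points: (n, d) array of embeddings; Euclidean pairwise distances
    are used here purely for illustration.
    """
    # Pairwise distance matrix D[i, j] = ||x_i - x_j||.
    diff = points[:, None, :] - points[None, :, :]
    D = np.linalg.norm(diff, axis=-1)

    # Gromov products with respect to base point 0:
    # G[i, j] = 0.5 * (D[0, i] + D[0, j] - D[i, j]).
    G = 0.5 * (D[0, :, None] + D[0, None, :] - D)

    # (max, min) "matrix product": M[i, j] = max_k min(G[i, k], G[k, j]).
    M = np.max(np.minimum(G[:, :, None], G[None, :, :]), axis=1)

    # delta is the largest violation of the four-point condition;
    # normalizing by the diameter makes values comparable across scales.
    delta = np.max(M - G)
    return float(delta / D.max())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(200, 64))  # stand-in for token embeddings
    print(f"relative delta-hyperbolicity: {relative_gromov_delta(emb):.3f}")
```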
Cite
Text
Yang et al. "Enhancing LLM Complex Reasoning Capability Through Hyperbolic Geometry." ICML 2024 Workshops: LLMs_and_Cognition, 2024.

Markdown

[Yang et al. "Enhancing LLM Complex Reasoning Capability Through Hyperbolic Geometry." ICML 2024 Workshops: LLMs_and_Cognition, 2024.](https://mlanthology.org/icmlw/2024/yang2024icmlw-enhancing/)

BibTeX
@inproceedings{yang2024icmlw-enhancing,
title = {{Enhancing LLM Complex Reasoning Capability Through Hyperbolic Geometry}},
author = {Yang, Menglin and Feng, Aosong and Xiong, Bo and Liu, Jiahong and King, Irwin and Ying, Rex},
booktitle = {ICML 2024 Workshops: LLMs_and_Cognition},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/yang2024icmlw-enhancing/}
}