Modeling Language Tokens as Functionals of Semantic Fields
Abstract
Recent advances in natural language processing have relied heavily on Transformer-based language models. However, Transformers often require large parameter counts and considerable model depth. Existing Transformer-free approaches based on state-space models demonstrate superiority over Transformers, yet they still lack a neuro-biological connection to the human brain. This paper proposes ${\it LasF}$, representing ${\bf L}$anguage tokens ${\bf as}$ ${\bf F}$unctionals of semantic fields, to simulate neuronal behaviors for better language modeling. The ${\it LasF}$ module is equivalent to a nonlinear approximator tailored for sequential data. By replacing the final layers of pre-trained language models with the ${\it LasF}$ module, we obtain ${\it LasF}$-based models. Experiments on standard reading-comprehension and question-answering tasks demonstrate that ${\it LasF}$-based models consistently improve accuracy with fewer parameters. In addition, we use CommonsenseQA's blind test set to evaluate a full-parameter fine-tuned ${\it LasF}$-based model, which outperforms the prior best ensemble and single models by $0.4\%$ and $3.1\%$, respectively. Furthermore, our ${\it LasF}$-only language model trained from scratch outperforms existing parameter-efficient language models on standard datasets such as WikiText-103 and Penn Treebank.
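The layer-replacement strategy the abstract describes, keeping a pre-trained model's early layers and substituting a custom nonlinear sequence module for its final layers, can be sketched generically. The sketch below is a minimal plain-Python/NumPy illustration of that swap pattern only; the `SimpleFieldHead` class and all of its internals are hypothetical stand-ins, not the paper's actual LasF construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_linear_layer(d):
    # Stand-in for one pre-trained block: a random position-wise
    # linear map with a residual connection.
    w = rng.normal(scale=0.1, size=(d, d))
    return lambda x: x + x @ w

class SimpleFieldHead:
    """Hypothetical nonlinear sequence head (NOT the paper's LasF module).

    Mixes each token with a decaying summary of preceding tokens and
    applies a tanh nonlinearity -- a generic nonlinear approximator
    for sequential data.
    """
    def __init__(self, d, decay=0.9):
        self.w = rng.normal(scale=0.1, size=(d, d))
        self.decay = decay

    def __call__(self, x):  # x: (seq_len, d)
        out = np.empty_like(x)
        state = np.zeros(x.shape[1])
        for t, tok in enumerate(x):
            state = self.decay * state + tok  # running sequence summary
            out[t] = np.tanh((tok + state) @ self.w)
        return out

d, seq_len, n_layers, n_replaced = 16, 8, 6, 2
pretrained = [make_linear_layer(d) for _ in range(n_layers)]

# Keep the early pre-trained layers; swap the final ones for the new head.
hybrid = pretrained[: n_layers - n_replaced] + [SimpleFieldHead(d)]

x = rng.normal(size=(seq_len, d))
h = x
for layer in hybrid:
    h = layer(h)
print(h.shape)  # (8, 16)
```

Because the head replaces two pre-trained layers with a single module, the hybrid stack is shallower than the original, which mirrors the parameter-reduction claim at a toy scale.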
Cite
Text
Pei et al. "Modeling Language Tokens as Functionals of Semantic Fields." International Conference on Machine Learning, 2024.
Markdown
[Pei et al. "Modeling Language Tokens as Functionals of Semantic Fields." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/pei2024icml-modeling/)
BibTeX
@inproceedings{pei2024icml-modeling,
title = {{Modeling Language Tokens as Functionals of Semantic Fields}},
author = {Pei, Zhengqi and Zhang, Anran and Wang, Shuhui and Huang, Qingming},
booktitle = {International Conference on Machine Learning},
year = {2024},
pages = {40114--40128},
volume = {235},
url = {https://mlanthology.org/icml/2024/pei2024icml-modeling/}
}