Modeling Language Tokens as Functionals of Semantic Fields

Abstract

Recent advances in natural language processing have relied heavily on Transformer-based language models. However, Transformers often require large parameter counts and considerable model depth. Existing Transformer-free approaches using state-space models demonstrate superiority over Transformers, yet they still lack a neuro-biological connection to the human brain. This paper proposes ${\it LasF}$, representing ${\bf L}$anguage tokens ${\bf as}$ ${\bf F}$unctionals of semantic fields, to simulate neuronal behaviors for better language modeling. The ${\it LasF}$ module is equivalent to a nonlinear approximator tailored for sequential data. By replacing the final layers of pre-trained language models with the ${\it LasF}$ module, we obtain ${\it LasF}$-based models. Experiments on standard reading-comprehension and question-answering tasks demonstrate that the ${\it LasF}$-based models consistently improve accuracy with fewer parameters. In addition, we evaluate a fully fine-tuned ${\it LasF}$-based model on CommonsenseQA's blind test set, where it outperforms the prior best ensemble and single models by $0.4\%$ and $3.1\%$, respectively. Furthermore, our ${\it LasF}$-only language model trained from scratch outperforms existing parameter-efficient language models on standard datasets such as WikiText-103 and Penn Treebank.
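The abstract does not spell out the module's internals, but the swap-in pattern it describes can be sketched. Below is a minimal, hypothetical PyTorch illustration of replacing the final blocks of a pre-trained model with a stand-in nonlinear module; the class `LasFLike`, the choice of two replaced blocks, and all layer shapes are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LasFLike(nn.Module):
    """Hypothetical stand-in for the LasF module: a residual nonlinear
    approximator over per-token hidden states. The actual 'functionals of
    semantic fields' design is not given in the abstract; only the
    layer-replacement pattern is illustrated here."""
    def __init__(self, d_model: int):
        super().__init__()
        self.field = nn.Linear(d_model, d_model)  # assumed "field" projection
        self.readout = nn.Sequential(nn.GELU(), nn.Linear(d_model, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); residual connection keeps shapes intact
        return x + self.readout(self.field(x))

# Toy backbone standing in for a pre-trained Transformer language model.
d_model, depth = 64, 4
backbone = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
     for _ in range(depth)]
)

# Replace the final two blocks with LasF-like modules (how many layers to
# replace is an assumption; the paper only says "the final layers").
backbone[-2] = LasFLike(d_model)
backbone[-1] = LasFLike(d_model)

x = torch.randn(2, 16, d_model)  # (batch, seq_len, d_model)
for layer in backbone:
    x = layer(x)
print(x.shape)  # torch.Size([2, 16, 64])
```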

Cite

Text

Pei et al. "Modeling Language Tokens as Functionals of Semantic Fields." International Conference on Machine Learning, 2024.

Markdown

[Pei et al. "Modeling Language Tokens as Functionals of Semantic Fields." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/pei2024icml-modeling/)

BibTeX

@inproceedings{pei2024icml-modeling,
  title     = {{Modeling Language Tokens as Functionals of Semantic Fields}},
  author    = {Pei, Zhengqi and Zhang, Anran and Wang, Shuhui and Huang, Qingming},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {40114--40128},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/pei2024icml-modeling/}
}