RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression

Abstract

Symbolic regression is a key task in machine learning, aiming to discover mathematical expressions that best describe a dataset. While deep learning has increased interest in using neural networks for symbolic regression, many existing approaches rely on pre-trained models. These models require significant computational resources and struggle with regression tasks involving unseen functions and variables. A pre-training-free paradigm is needed to better integrate with search-based symbolic regression algorithms. To address these limitations, we propose a novel framework for symbolic regression that integrates evolutionary feature construction with a neural network, without the need for pre-training. Our approach adaptively generates symbolic trees that align with the desired semantics in real time using a language model trained via online supervised learning, providing effective building blocks for feature construction. To mitigate hallucinations from the language model, we design a retrieval-augmented generation mechanism that explicitly leverages searched symbolic expressions. Additionally, we introduce a scale-invariant data augmentation technique that further improves the robustness and generalization of the model. Experimental results demonstrate that our framework achieves state-of-the-art accuracy when compared against 25 regression algorithms on 120 regression tasks.

Cite

Text

Zhang et al. "RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression." International Conference on Learning Representations, 2025.

Markdown

[Zhang et al. "RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/zhang2025iclr-ragsr/)

BibTeX

@inproceedings{zhang2025iclr-ragsr,
  title     = {{RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression}},
  author    = {Zhang, Hengzhe and Chen, Qi and Xue, Bing and Banzhaf, Wolfgang and Zhang, Mengjie},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/zhang2025iclr-ragsr/}
}