Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients

Brenden K. Petersen, Mikel Landajuela Larma, Terrell N. Mundhenk, Claudio Prata Santiago, Soo Kyung Kim, Joanne Taery Kim

ICLR 2021

Abstract

Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of symbolic regression. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are underexplored. We propose a framework that leverages deep learning for symbolic regression via a simple idea: use a large model to search the space of small models. Specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions and employ a novel risk-seeking policy gradient to train the network to generate better-fitting expressions. Our algorithm outperforms several baseline methods (including Eureqa, the gold standard for symbolic regression) in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate constraints in situ, and a risk-seeking policy gradient formulation that optimizes for best-case performance instead of expected performance.
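
To illustrate the contrast between best-case and expected performance, here is a minimal sketch of how per-sample gradient weights might differ between the two objectives. It assumes the risk-seeking objective is estimated by keeping only the top epsilon fraction of sampled expressions in a batch and using the batch's empirical reward quantile as the baseline. The function names, the epsilon value, and the toy rewards are illustrative assumptions, not details stated in the abstract.

```python
import math


def risk_seeking_weights(rewards, epsilon):
    """Weights for a quantile-based risk-seeking gradient estimate (sketch).

    Assumption: each sampled expression i contributes
        weight_i * grad(log p(expression_i))
    to the gradient, and only samples at or above the empirical
    (1 - epsilon) reward quantile receive a nonzero weight.
    """
    n = len(rewards)
    if n == 0 or not 0.0 < epsilon <= 1.0:
        raise ValueError("need at least one reward and 0 < epsilon <= 1")
    # Number of top samples to keep; the small offset guards against
    # floating-point overshoot in epsilon * n (e.g. 0.1 * 30).
    k = max(1, math.ceil(epsilon * n - 1e-9))
    # Empirical quantile: the smallest reward among the top k samples.
    threshold = sorted(rewards, reverse=True)[k - 1]
    return [
        (r - threshold) / (epsilon * n) if r >= threshold else 0.0
        for r in rewards
    ]


def expected_weights(rewards):
    """Standard policy-gradient weights with a mean baseline, for contrast."""
    n = len(rewards)
    baseline = sum(rewards) / n
    return [(r - baseline) / n for r in rewards]


if __name__ == "__main__":
    # Toy rewards for four sampled expressions (hypothetical values).
    batch = [0.1, 0.5, 0.9, 0.3]
    print(risk_seeking_weights(batch, epsilon=0.5))
    print(expected_weights(batch))
```

In this sketch the risk-seeking weights ignore every sample below the quantile, so poorly fitting expressions neither push nor pull the policy, whereas the mean-baseline weights actively penalize them. That matches the stated goal of rewarding the best expressions found, not the average quality of the batch.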

Cite

Text

Petersen et al. "Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients." International Conference on Learning Representations, 2021.

Markdown

[Petersen et al. "Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/petersen2021iclr-deep/)

BibTeX

@inproceedings{petersen2021iclr-deep,
  title     = {{Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients}},
  author    = {Petersen, Brenden K and Larma, Mikel Landajuela and Mundhenk, Terrell N. and Santiago, Claudio Prata and Kim, Soo Kyung and Kim, Joanne Taery},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/petersen2021iclr-deep/}
}