On the Theoretical Limit of Gradient Descent for Simple Recurrent Neural Networks with Finite Precision
Abstract
Despite their great practical success, the understanding of neural network behavior remains an active research topic. In particular, the class of functions learnable under a finite-precision configuration is an open question. In this paper, we propose to study the limits of gradient descent when such a configuration is set for the class of Simple Recurrent Networks (SRN). We exhibit conditions under which gradient descent will provably fail. We also design a class of SRNs based on Deterministic Finite State Automata (DFA) that fulfills these failure conditions. The definition of this class is constructive: we propose an algorithm that, from any DFA, constructs an SRN that computes exactly the same function, a result of interest in its own right.
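To give a feel for the kind of DFA-to-SRN construction the abstract refers to, the sketch below encodes a DFA into a first-order recurrent network with one-hot state-symbol units and hard-threshold activations, in the spirit of classic DFA-to-RNN embeddings. This is an illustrative assumption, not the authors' algorithm: the paper's construction and finite-precision setting are its own, and the helper names (`dfa_to_srn`, `run`) and the step nonlinearity are hypothetical choices made only for this example.

```python
# Hypothetical sketch (NOT the paper's construction): a one-hot, threshold-unit
# encoding of a DFA into a simple recurrent network
#   h_{t+1} = step(W h_t + U x_{t+1} + b),   accept iff v . h_T + c > 0.
import numpy as np

def step(z):
    return (z > 0).astype(float)

def dfa_to_srn(states, alphabet, delta, q0, finals):
    """Build (W, U, b, v, c, h0, sym) so the SRN simulates the DFA exactly.

    Hidden unit (q, a) means "the previous state was q and the symbol just
    read was a"; one extra dummy unit encodes the initial state q0.
    """
    pairs = [(q, a) for q in states for a in alphabet]
    idx = {p: i for i, p in enumerate(pairs)}
    H, D = len(pairs) + 1, len(pairs)          # D = index of the dummy unit
    sym = {a: j for j, a in enumerate(alphabet)}
    W = np.zeros((H, H)); U = np.zeros((H, len(alphabet)))
    for (q, a) in pairs:
        tgt = delta[(q, a)]                    # state reached by this pair
        for a2 in alphabet:                    # pairs that may follow it
            W[idx[(tgt, a2)], idx[(q, a)]] = 1.0
    for a2 in alphabet:                        # dummy unit behaves like state q0
        W[idx[(q0, a2)], D] = 1.0
    for (q, a) in pairs:
        U[idx[(q, a)], sym[a]] = 1.0
    b = -1.5 * np.ones(H)                      # unit fires only if both inputs are 1
    v = np.zeros(H); c = -0.5
    for (q, a) in pairs:
        if delta[(q, a)] in finals:
            v[idx[(q, a)]] = 1.0
    if q0 in finals:
        v[D] = 1.0                             # empty word accepted from q0
    h0 = np.zeros(H); h0[D] = 1.0
    return W, U, b, v, c, h0, sym

def run(params, word):
    W, U, b, v, c, h, sym = params
    for a in word:
        x = np.zeros(U.shape[1]); x[sym[a]] = 1.0
        h = step(W @ h + U @ x + b)            # exactly one unit stays active
    return bool(v @ h + c > 0)

# Toy check: DFA over {a, b} accepting words with an even number of b's.
delta = {("even", "a"): "even", ("even", "b"): "odd",
         ("odd", "a"): "odd",  ("odd", "b"): "even"}
params = dfa_to_srn(["even", "odd"], ["a", "b"], delta, "even", {"even"})
assert run(params, "abba") and not run(params, "ab")
```

Because exactly one hidden unit is active at every step, the simulation is exact for words of any length; the paper's contribution is a construction of this flavor together with an analysis of when gradient descent provably fails on the resulting networks under finite precision.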
Cite
Text
Mitarchuk et al. "On the Theoretical Limit of Gradient Descent for Simple Recurrent Neural Networks with Finite Precision." Transactions on Machine Learning Research, 2024.
Markdown
[Mitarchuk et al. "On the Theoretical Limit of Gradient Descent for Simple Recurrent Neural Networks with Finite Precision." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/mitarchuk2024tmlr-theoretical/)
BibTeX
@article{mitarchuk2024tmlr-theoretical,
title = {{On the Theoretical Limit of Gradient Descent for Simple Recurrent Neural Networks with Finite Precision}},
author = {Mitarchuk, Volodimir and Emonet, Rémi and Eyraud, Remi and Habrard, Amaury},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/mitarchuk2024tmlr-theoretical/}
}