Role of Parametrization in Learning Dynamics of Recurrent Neural Networks
Abstract
The characteristics of the loss landscape are vital for efficient gradient-based optimization of recurrent neural networks (RNNs). Learning dynamics in continuous-time RNNs are prone to plateauing, and recent studies address this issue by analyzing loss landscapes, particularly in the setting of linear time-invariant (LTI) systems. Building on this work, we study a simplified setting and analyze the loss landscape under modal and canonical parametrizations, derived from their respective state-space realizations. We find that the canonical parametrization offers improved quasi-convexity properties and faster learning than the modal form. The theoretical results are corroborated by numerical experiments. We also show that autonomous ReLU-based RNNs with a modal structure generate trajectories that can be reproduced by an LTI system, whereas those with a canonical structure produce more complex trajectories beyond the scope of LTI systems.
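To make the two parametrizations concrete, here is a minimal sketch in standard state-space notation (the paper's exact conventions may differ): for a single-input single-output LTI system $\dot{x} = Ax + bu$, $y = c^\top x$ with distinct real eigenvalues $\lambda_1, \dots, \lambda_n$, the modal realization parametrizes the diagonalized dynamics directly,

$$A_{\text{modal}} = \operatorname{diag}(\lambda_1, \dots, \lambda_n),$$

while the (controllable) canonical realization parametrizes the coefficients $a_0, \dots, a_{n-1}$ of the characteristic polynomial through a companion matrix,

$$A_{\text{canon}} = \begin{pmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ -a_0 & -a_1 & \cdots & -a_{n-1} \end{pmatrix}.$$

Both realizations implement the same input-output map, but gradient descent over the eigenvalues $\lambda_i$ and over the coefficients $a_i$ traverses differently shaped loss landscapes, which is the contrast at issue here.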
Cite
Text
Datar et al. "Role of Parametrization in Learning Dynamics of Recurrent Neural Networks." NeurIPS 2024 Workshops: OPT, 2024.
Markdown
[Datar et al. "Role of Parametrization in Learning Dynamics of Recurrent Neural Networks." NeurIPS 2024 Workshops: OPT, 2024.](https://mlanthology.org/neuripsw/2024/datar2024neuripsw-role/)
BibTeX
@inproceedings{datar2024neuripsw-role,
title = {{Role of Parametrization in Learning Dynamics of Recurrent Neural Networks}},
author = {Datar, Adwait and Datar, Chinmay and Monfared, Zahra and Dietrich, Felix},
booktitle = {NeurIPS 2024 Workshops: OPT},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/datar2024neuripsw-role/}
}