Role of Parametrization in Learning Dynamics of Recurrent Neural Networks

Abstract

The characteristics of the loss landscape are vital for efficient gradient-based optimization of recurrent neural networks (RNNs). Learning dynamics in continuous-time RNNs are prone to plateauing, and recent studies have addressed this issue by analyzing loss landscapes, particularly in the setting of linear time-invariant (LTI) systems. Building on this work, we consider a simplified setting and study the loss landscape under modal and canonical parametrizations, derived from their respective state-space realizations. We find that the canonical parametrization offers improved quasi-convexity properties and faster learning than modal forms. Theoretical results are corroborated by numerical experiments. We also show that autonomous ReLU-based RNNs with a modal structure generate trajectories that can be produced by an LTI system, whereas those with a canonical structure produce more complex trajectories beyond the scope of LTI systems.
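
To make the two parametrizations concrete, below is a minimal NumPy sketch (not from the paper; the eigenvalues -0.5 and 0.8 and all matrix entries are illustrative choices). It realizes one discrete-time SISO LTI system in both a modal (diagonal) form and a controllable canonical (companion) form, and checks that their impulse responses coincide. Since both realizations define the same input-output map, any difference in learning behavior comes purely from the parameter coordinates, which is the phenomenon the paper studies.

```python
import numpy as np

# Modal (diagonal) realization: each state evolves along one eigenmode.
# All numerical values here are illustrative, not taken from the paper.
A_modal = np.diag([-0.5, 0.8])
B_modal = np.array([[1.0], [1.0]])
C_modal = np.array([[0.3, 0.7]])

# Controllable canonical realization of the same transfer function
# H(z) = (z + 0.11) / (z^2 - 0.3 z - 0.4): A is a companion matrix whose
# last row holds the negated characteristic-polynomial coefficients.
A_canon = np.array([[0.0, 1.0],
                    [0.4, 0.3]])
B_canon = np.array([[0.0], [1.0]])
C_canon = np.array([[0.11, 1.0]])

def impulse_response(A, B, C, n=20):
    """Markov parameters C A^k B of the discrete-time system
    x_{k+1} = A x_k + B u_k,  y_k = C x_k."""
    h, Ak = [], np.eye(A.shape[0])
    for _ in range(n):
        h.append((C @ Ak @ B).item())
        Ak = A @ Ak
    return np.array(h)

# Same input-output map, different parameter coordinates -- and hence
# different gradients and loss-landscape geometry during training.
assert np.allclose(impulse_response(A_modal, B_modal, C_modal),
                   impulse_response(A_canon, B_canon, C_canon))
```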

Cite

Text

Datar et al. "Role of Parametrization in Learning Dynamics of Recurrent Neural Networks." NeurIPS 2024 Workshops: OPT, 2024.

Markdown

[Datar et al. "Role of Parametrization in Learning Dynamics of Recurrent Neural Networks." NeurIPS 2024 Workshops: OPT, 2024.](https://mlanthology.org/neuripsw/2024/datar2024neuripsw-role/)

BibTeX

@inproceedings{datar2024neuripsw-role,
  title     = {{Role of Parametrization in Learning Dynamics of Recurrent Neural Networks}},
  author    = {Datar, Adwait and Datar, Chinmay and Monfared, Zahra and Dietrich, Felix},
  booktitle = {NeurIPS 2024 Workshops: OPT},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/datar2024neuripsw-role/}
}