Orvieto, Antonio
46 publications
NeurIPS
2024
Recurrent Neural Networks: Vanishing and Exploding Gradients Are Not the End of the Story
NeurIPS
2024
Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
CVPR
2023
Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning
NeurIPSW
2022
Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning
NeurIPS
2022
Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution
NeurIPS
2022
Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse