Orvieto, Antonio

46 publications

ICLR 2025. Adaptive Methods Through the Lens of SDEs: Theoretical Insights on the Role of Noise. Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov, Frank Norbert Proske, Antonio Orvieto, Aurelien Lucchi.
COLT 2025. An Uncertainty Principle for Linear Recurrent Neural Networks. Alexandre François, Antonio Orvieto, Francis Bach.
ICLRW 2025. Can You Finetune Your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models. Fay Elhassan, Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping.
NeurIPS 2025. Enhancing Optimizer Stability: Momentum Adaptation of the NGN Step-Size. Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi.
ICLRW 2025. Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations. Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto.
NeurIPS 2025. Fixed-Point RNNs: Interpolating from Diagonal to Dense. Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto.
ICML 2025. Generalized Interpolating Discrete Diffusion. Dimitri Von Rütte, Janis Fluri, Yuhui Ding, Antonio Orvieto, Bernhard Schölkopf, Thomas Hofmann.
NeurIPS 2025. Generalized Linear Mode Connectivity for Transformers. Alexander Theus, Alessandro Cabodi, Sotiris Anagnostidis, Antonio Orvieto, Sidak Pal Singh, Valentina Boeva.
ICLR 2025. Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture. Sajad Movahedi, Antonio Orvieto, Seyed-Mohsen Moosavi-Dezfooli.
ICLRW 2025. Hyper-Align: Efficient Modality Alignment via Hypernetworks. Jaisidh Singh, Diganta Misra, Boris Knyazev, Antonio Orvieto.
NeurIPS 2025. In Search of Adam’s Secret Sauce. Antonio Orvieto, Robert M. Gower.
ICLRW 2025. Revisiting Associative Recall in Modern Recurrent Models. Destiny Okpekpe, Antonio Orvieto.
ICML 2025. When, Where and Why to Average Weights? Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping.
NeurIPS 2024. Loss Landscape Characterization of Neural Networks Without Over-Parametrization. Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi.
ICML 2024. Recurrent Distance Filtering for Graph Representation Learning. Yuhui Ding, Antonio Orvieto, Bobby He, Thomas Hofmann.
NeurIPS 2024. Recurrent Neural Networks: Vanishing and Exploding Gradients Are Not the End of the Story. Nicolas Zucchet, Antonio Orvieto.
AISTATS 2024. SDEs for Minimax Optimization. Enea Monzio Compagnoni, Antonio Orvieto, Hans Kersting, Frank Proske, Aurelien Lucchi.
NeurIPS 2024. Super Consistency of Neural Network Landscapes and Learning Rate Transfer. Lorenzo Noci, Alexandru Meterez, Thomas Hofmann, Antonio Orvieto.
NeurIPS 2024. Theoretical Foundations of Deep Selective State-Space Models. Nicola Muca Cirone, Antonio Orvieto, Benjamin Walker, Cristopher Salvi, Terry Lyons.
NeurIPS 2024. Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks. Jerome Sieber, Carmen Amo Alonso, Alexandre Didier, Melanie N. Zeilinger, Antonio Orvieto.
ICML 2024. Universality of Linear Recurrences Followed by Non-Linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues. Antonio Orvieto, Soham De, Caglar Gulcehre, Razvan Pascanu, Samuel L Smith.
CVPR 2023. Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning. Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann.
ICML 2023. An SDE for Modeling SAM: Theory and Insights. Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, Frank Norbert Proske, Hans Kersting, Aurelien Lucchi.
NeurIPSW 2023. Escaping Random Teacher Initialization Enhances Signal Propagation and Representation. Felix Sarnthein, Sidak Pal Singh, Antonio Orvieto, Thomas Hofmann.
AISTATS 2023. Explicit Regularization in Overparametrized Models via Noise Injection. Antonio Orvieto, Anant Raj, Hans Kersting, Francis Bach.
ICML 2023. Resurrecting Recurrent Neural Networks for Long Sequences. Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, Soham De.
AISTATS 2022. Faster Single-Loop Algorithms for Minimax Optimization Without Strong Concavity. Junchi Yang, Antonio Orvieto, Aurelien Lucchi, Niao He.
AISTATS 2022. Vanishing Curvature in Randomly Initialized Deep ReLU Networks. Antonio Orvieto, Jonas Kohler, Dario Pavllo, Thomas Hofmann, Aurelien Lucchi.
NeurIPSW 2022. Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning. Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann.
ICML 2022. Anticorrelated Noise Injection for Improved Generalization. Antonio Orvieto, Hans Kersting, Frank Proske, Francis Bach, Aurelien Lucchi.
NeurIPSW 2022. Batch Size Selection by Stochastic Optimal Control. Jim Zhao, Aurelien Lucchi, Frank Norbert Proske, Antonio Orvieto, Hans Kersting.
NeurIPS 2022. Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution. Antonio Orvieto, Simon Lacoste-Julien, Nicolas Loizou.
ICMLW 2022. Enhancing Unit-Tests for Invariance Discovery. Piersilvio De Bartolomeis, Antonio Orvieto, Giambattista Parascandolo.
NeurIPS 2022. On the Theoretical Properties of Noise Correlation in Stochastic Optimization. Aurelien Lucchi, Frank Proske, Antonio Orvieto, Francis R. Bach, Hans Kersting.
NeurIPS 2022. Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse. Lorenzo Noci, Sotiris Anagnostidis, Luca Biggio, Antonio Orvieto, Sidak Pal Singh, Aurelien Lucchi.
AISTATS 2021. Momentum Improves Optimization on Riemannian Manifolds. Foivos Alimisis, Antonio Orvieto, Gary Becigneul, Aurelien Lucchi.
AISTATS 2021. Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization. Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith.
NeurIPSW 2021. Empirics on the Expressiveness of Randomized Signature. Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto.
ICLR 2021. Learning Explanations That Are Hard to Vary. Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf.
NeurIPS 2021. On the Second-Order Convergence Properties of Random Search Methods. Aurelien Lucchi, Antonio Orvieto, Adamos Solomou.
NeurIPS 2021. Rethinking the Variational Interpretation of Accelerated Optimization Methods. Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand.
AISTATS 2020. A Continuous-Time Perspective for Modeling Acceleration in Riemannian Optimization. Foivos Alimisis, Antonio Orvieto, Gary Becigneul, Aurelien Lucchi.
ICML 2020. An Accelerated DFO Algorithm for Finite-Sum Convex Functions. Yuwen Chen, Antonio Orvieto, Aurelien Lucchi.
NeurIPS 2019. Continuous-Time Models for Stochastic Optimization Algorithms. Antonio Orvieto, Aurelien Lucchi.
NeurIPS 2019. Shadowing Properties of Optimization Algorithms. Antonio Orvieto, Aurelien Lucchi.
UAI 2019. The Role of Memory in Stochastic Optimization. Antonio Orvieto, Jonas Kohler, Aurelien Lucchi.