Orvieto, Antonio

46 publications

ICLR 2025 Adaptive Methods Through the Lens of SDEs: Theoretical Insights on the Role of Noise Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov, Frank Norbert Proske, Antonio Orvieto, Aurelien Lucchi

COLT 2025 An Uncertainty Principle for Linear Recurrent Neural Networks Alexandre François, Antonio Orvieto, Francis Bach

ICLRW 2025 Can You Finetune Your Binoculars? Embedding Text Watermarks into the Weights of Large Language Models Fay Elhassan, Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping

NeurIPS 2025 Enhancing Optimizer Stability: Momentum Adaptation of the NGN Step-Size Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi

ICLRW 2025 Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto

NeurIPS 2025 Fixed-Point RNNs: Interpolating from Diagonal to Dense Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto

ICML 2025 Generalized Interpolating Discrete Diffusion Dimitri von Rütte, Janis Fluri, Yuhui Ding, Antonio Orvieto, Bernhard Schölkopf, Thomas Hofmann

NeurIPS 2025 Generalized Linear Mode Connectivity for Transformers Alexander Theus, Alessandro Cabodi, Sotiris Anagnostidis, Antonio Orvieto, Sidak Pal Singh, Valentina Boeva

ICLR 2025 Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture Sajad Movahedi, Antonio Orvieto, Seyed-Mohsen Moosavi-Dezfooli

ICLRW 2025 Hyper-Align: Efficient Modality Alignment via Hypernetworks Jaisidh Singh, Diganta Misra, Boris Knyazev, Antonio Orvieto

NeurIPS 2025 In Search of Adam’s Secret Sauce Antonio Orvieto, Robert M. Gower

ICLRW 2025 Revisiting Associative Recall in Modern Recurrent Models Destiny Okpekpe, Antonio Orvieto

ICML 2025 When, Where and Why to Average Weights? Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping

NeurIPS 2024 Loss Landscape Characterization of Neural Networks Without Over-Parametrization Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi

ICML 2024 Recurrent Distance Filtering for Graph Representation Learning Yuhui Ding, Antonio Orvieto, Bobby He, Thomas Hofmann

NeurIPS 2024 Recurrent Neural Networks: Vanishing and Exploding Gradients Are Not the End of the Story Nicolas Zucchet, Antonio Orvieto

AISTATS 2024 SDEs for Minimax Optimization Enea Monzio Compagnoni, Antonio Orvieto, Hans Kersting, Frank Proske, Aurelien Lucchi

NeurIPS 2024 Super Consistency of Neural Network Landscapes and Learning Rate Transfer Lorenzo Noci, Alexandru Meterez, Thomas Hofmann, Antonio Orvieto

NeurIPS 2024 Theoretical Foundations of Deep Selective State-Space Models Nicola Muca Cirone, Antonio Orvieto, Benjamin Walker, Cristopher Salvi, Terry Lyons

NeurIPS 2024 Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks Jerome Sieber, Carmen Amo Alonso, Alexandre Didier, Melanie N. Zeilinger, Antonio Orvieto

ICML 2024 Universality of Linear Recurrences Followed by Non-Linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues Antonio Orvieto, Soham De, Caglar Gulcehre, Razvan Pascanu, Samuel L Smith

CVPR 2023 Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann

ICML 2023 An SDE for Modeling SAM: Theory and Insights Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, Frank Norbert Proske, Hans Kersting, Aurelien Lucchi

NeurIPSW 2023 Escaping Random Teacher Initialization Enhances Signal Propagation and Representation Felix Sarnthein, Sidak Pal Singh, Antonio Orvieto, Thomas Hofmann

AISTATS 2023 Explicit Regularization in Overparametrized Models via Noise Injection Antonio Orvieto, Anant Raj, Hans Kersting, Francis Bach

ICML 2023 Resurrecting Recurrent Neural Networks for Long Sequences Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, Soham De

AISTATS 2022 Faster Single-Loop Algorithms for Minimax Optimization Without Strong Concavity Junchi Yang, Antonio Orvieto, Aurelien Lucchi, Niao He

AISTATS 2022 Vanishing Curvature in Randomly Initialized Deep ReLU Networks Antonio Orvieto, Jonas Kohler, Dario Pavllo, Thomas Hofmann, Aurelien Lucchi

NeurIPSW 2022 Achieving a Better Stability-Plasticity Trade-Off via Auxiliary Networks in Continual Learning Sanghwan Kim, Lorenzo Noci, Antonio Orvieto, Thomas Hofmann

ICML 2022 Anticorrelated Noise Injection for Improved Generalization Antonio Orvieto, Hans Kersting, Frank Proske, Francis Bach, Aurelien Lucchi

NeurIPSW 2022 Batch Size Selection by Stochastic Optimal Control Jim Zhao, Aurelien Lucchi, Frank Norbert Proske, Antonio Orvieto, Hans Kersting

NeurIPS 2022 Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution Antonio Orvieto, Simon Lacoste-Julien, Nicolas Loizou

ICMLW 2022 Enhancing Unit-Tests for Invariance Discovery Piersilvio De Bartolomeis, Antonio Orvieto, Giambattista Parascandolo

NeurIPS 2022 On the Theoretical Properties of Noise Correlation in Stochastic Optimization Aurelien Lucchi, Frank Proske, Antonio Orvieto, Francis R. Bach, Hans Kersting

NeurIPS 2022 Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse Lorenzo Noci, Sotiris Anagnostidis, Luca Biggio, Antonio Orvieto, Sidak Pal Singh, Aurelien Lucchi

AISTATS 2021 Momentum Improves Optimization on Riemannian Manifolds Foivos Alimisis, Antonio Orvieto, Gary Becigneul, Aurelien Lucchi

AISTATS 2021 Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand, Thomas Hofmann, Roy S. Smith

NeurIPSW 2021 Empirics on the Expressiveness of Randomized Signature Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto

ICLR 2021 Learning Explanations That Are Hard to Vary Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schölkopf

NeurIPS 2021 On the Second-Order Convergence Properties of Random Search Methods Aurelien Lucchi, Antonio Orvieto, Adamos Solomou

NeurIPS 2021 Rethinking the Variational Interpretation of Accelerated Optimization Methods Peiyuan Zhang, Antonio Orvieto, Hadi Daneshmand

AISTATS 2020 A Continuous-Time Perspective for Modeling Acceleration in Riemannian Optimization Foivos Alimisis, Antonio Orvieto, Gary Becigneul, Aurelien Lucchi

ICML 2020 An Accelerated DFO Algorithm for Finite-Sum Convex Functions Yuwen Chen, Antonio Orvieto, Aurelien Lucchi

NeurIPS 2019 Continuous-Time Models for Stochastic Optimization Algorithms Antonio Orvieto, Aurelien Lucchi

NeurIPS 2019 Shadowing Properties of Optimization Algorithms Antonio Orvieto, Aurelien Lucchi

UAI 2019 The Role of Memory in Stochastic Optimization Antonio Orvieto, Jonas Kohler, Aurelien Lucchi