Papini, Matteo

29 publications

ICML 2025 Convergence Analysis of Policy Gradient Methods with Dynamic Stochasticity Alessandro Montenegro, Marco Mussi, Matteo Papini, Alberto Maria Metelli
MLJ 2025 Search or Split: Policy Gradient with Adaptive Policy Space Gianmarco Tedeschi, Matteo Papini, Alberto Maria Metelli, Marcello Restelli
ALT 2024 Importance-Weighted Offline Learning Done Right Germano Gabbianelli, Gergely Neu, Matteo Papini
NeurIPS 2024 Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning Alessandro Montenegro, Marco Mussi, Matteo Papini, Alberto Maria Metelli
ICML 2024 Learning Optimal Deterministic Policies with Stochastic Policy Gradients Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli, Matteo Papini
NeurIPS 2024 Local Linearity: The Key for No-Regret Reinforcement Learning in Continuous MDPs Davide Maran, Alberto Maria Metelli, Matteo Papini, Marcello Restelli
ICML 2024 No-Regret Reinforcement Learning in Smooth MDPs Davide Maran, Alberto Maria Metelli, Matteo Papini, Marcello Restelli
AISTATS 2024 Offline Primal-Dual Reinforcement Learning for Linear MDPs Germano Gabbianelli, Gergely Neu, Matteo Papini, Nneka M. Okolo
IJCAI 2024 Online Learning with Off-Policy Feedback in Adversarial MDPs Francesco Bacchiocchi, Francesco Emanuele Stradi, Matteo Papini, Alberto Maria Metelli, Nicola Gatti
COLT 2024 Optimistic Information Directed Sampling Gergely Neu, Matteo Papini, Ludovic Schwartz
ICMLW 2024 Optimistic Information Directed Sampling Gergely Neu, Matteo Papini, Ludovic Schwartz
ICMLW 2024 Policy Gradient Methods with Adaptive Policy Spaces Gianmarco Tedeschi, Matteo Papini, Marcello Restelli
COLT 2024 Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs Davide Maran, Alberto Maria Metelli, Matteo Papini, Marcello Restelli
MLJ 2024 Sample Complexity of Variance-Reduced Policy Gradient: Weaker Assumptions and Lower Bounds Gabor Paczolay, Matteo Papini, Alberto Maria Metelli, István Á. Harmati, Marcello Restelli
ALT 2023 Online Learning with Off-Policy Feedback Germano Gabbianelli, Gergely Neu, Matteo Papini
NeurIPS 2022 Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits Gergely Neu, Iuliia Olkhovskaia, Matteo Papini, Ludovic Schwartz
NeurIPS 2022 Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees Andrea Tirinzoni, Matteo Papini, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta
MLJ 2022 Smoothing Policies and Safe Policy Gradients Matteo Papini, Matteo Pirotta, Marcello Restelli
ICML 2021 Leveraging Good Representations in Linear Contextual Bandits Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
AAAI 2021 Policy Optimization as Online Learning with Mediator Feedback Alberto Maria Metelli, Matteo Papini, Pierluca D'Oro, Marcello Restelli
NeurIPS 2021 Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta
AISTATS 2020 Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration Matteo Papini, Andrea Battistello, Marcello Restelli
AAAI 2020 Gradient-Aware Model-Based Policy Search Pierluca D'Oro, Alberto Maria Metelli, Andrea Tirinzoni, Matteo Papini, Marcello Restelli
JMLR 2020 Importance Sampling Techniques for Policy Optimization Alberto Maria Metelli, Matteo Papini, Nico Montali, Marcello Restelli
IJCAI 2020 Risk-Averse Trust Region Optimization for Reward-Volatility Reduction Lorenzo Bisi, Luca Sabbioni, Edoardo Vittori, Matteo Papini, Marcello Restelli
ICML 2019 Optimistic Policy Optimization via Multiple Importance Sampling Matteo Papini, Alberto Maria Metelli, Lorenzo Lupo, Marcello Restelli
NeurIPS 2018 Policy Optimization via Importance Sampling Alberto Maria Metelli, Matteo Papini, Francesco Faccio, Marcello Restelli
ICML 2018 Stochastic Variance-Reduced Policy Gradient Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli
NeurIPS 2017 Adaptive Batch Size for Safe Policy Gradients Matteo Papini, Matteo Pirotta, Marcello Restelli