Valko, Michal

114 publications

ICML 2025 The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback Côme Fiegel, Pierre Menard, Tadashi Kozuno, Michal Valko, Vianney Perchet

AISTATS 2024 A General Theoretical Paradigm to Understand Learning from Human Preferences Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Remi Munos, Mark Rowland, Michal Valko, Daniele Calandriello

ICML 2024 Decoding-Time Realignment of Language Models Tianlin Liu, Shangmin Guo, Leonardo Bianco, Daniele Calandriello, Quentin Berthet, Felipe Llinares-López, Jessica Hoffmann, Lucas Dixon, Michal Valko, Mathieu Blondel

ICLR 2024 Demonstration-Regularized RL Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Menard

ICML 2024 Generalized Preference Optimization: A Unified Approach to Offline Alignment Yunhao Tang, Zhaohan Daniel Guo, Zeyu Zheng, Daniele Calandriello, Remi Munos, Mark Rowland, Pierre Harvey Richemond, Michal Valko, Bernardo Avila Pires, Bilal Piot

ICML 2024 Human Alignment of Large Language Models Through Online Preference Optimisation Daniele Calandriello, Zhaohan Daniel Guo, Remi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot

NeurIPS 2024 Local and Adaptive Mirror Descents in Extensive-Form Games Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

NeurIPS 2024 Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora

ICMLW 2024 Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving Aniket Rajiv Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy P Lillicrap, Danilo Jimenez Rezende, Yoshua Bengio, Michael Curtis Mozer, Sanjeev Arora

ICML 2024 Nash Learning from Human Feedback Remi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar, Mark Rowland, Zhaohan Daniel Guo, Yunhao Tang, Matthieu Geist, Thomas Mesnard, Côme Fiegel, Andrea Michi, Marco Selvi, Sertan Girgin, Nikola Momchev, Olivier Bachem, Daniel J Mankowitz, Doina Precup, Bilal Piot

ICLR 2024 Unlocking the Power of Representations in Long-Term Novelty-Based Exploration Alaa Saade, Steven Kapturowski, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot

ICML 2023 Adapting to Game Trees in Zero-Sum Imperfect Information Games Côme Fiegel, Pierre Menard, Tadashi Kozuno, Remi Munos, Vianney Perchet, Michal Valko

ICML 2023 Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Remi Munos, Michal Valko

ICML 2023 DoMo-AC: Doubly Multi-Step Off-Policy Actor-Critic Algorithm Yunhao Tang, Tadashi Kozuno, Mark Rowland, Anna Harutyunyan, Remi Munos, Bernardo Avila Pires, Michal Valko

ICML 2023 Fast Rates for Maximum Entropy Exploration Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

ICML 2023 Half-Hop: A Graph Upsampling Approach for Slowing Down Message Passing Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor, Chi-Heng Lin, Lakshmi Sathidevi, Ran Liu, Michal Valko, Petar Veličković, Eva L Dyer

NeurIPSW 2023 Middle-Mile Logistics Through the Lens of Goal-Conditioned Reinforcement Learning Onno Eberhard, Thibaut Cuvelier, Michal Valko, Bruno Adrien De Backer

NeurIPS 2023 Model-Free Posterior Sampling via Learning Rate Randomization Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Ménard

ICML 2023 Quantile Credit Assignment Thomas Mesnard, Wenqi Chen, Alaa Saade, Yunhao Tang, Mark Rowland, Theophane Weber, Clare Lyle, Audrunas Gruslys, Michal Valko, Will Dabney, Georg Ostrovski, Eric Moulines, Remi Munos

ICML 2023 Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice Toshinori Kitamura, Tadashi Kozuno, Yunhao Tang, Nino Vieillard, Michal Valko, Wenhao Yang, Jincheng Mei, Pierre Menard, Mohammad Gheshlaghi Azar, Remi Munos, Olivier Pietquin, Matthieu Geist, Csaba Szepesvari, Wataru Kumagai, Yutaka Matsuo

ICML 2023 Understanding Self-Predictive Learning for Reinforcement Learning Yunhao Tang, Zhaohan Daniel Guo, Pierre Harvey Richemond, Bernardo Avila Pires, Yash Chandak, Remi Munos, Mark Rowland, Mohammad Gheshlaghi Azar, Charline Le Lan, Clare Lyle, András György, Shantanu Thakoor, Will Dabney, Bilal Piot, Daniele Calandriello, Michal Valko

NeurIPSW 2023 Unlocking the Power of Representations in Long-Term Novelty-Based Exploration Steven Kapturowski, Alaa Saade, Daniele Calandriello, Charles Blundell, Pablo Sprechmann, Leopoldo Sarra, Oliver Groth, Michal Valko, Bilal Piot

ICML 2023 VA-Learning as a More Efficient Alternative to Q-Learning Yunhao Tang, Remi Munos, Mark Rowland, Michal Valko

AISTATS 2022 Adaptive Multi-Goal Exploration Jean Tarbouriech, Omar Darwiche Domingues, Pierre Menard, Matteo Pirotta, Michal Valko, Alessandro Lazaric

AISTATS 2022 Marginalized Operators for Off-Policy Reinforcement Learning Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko

NeurIPS 2022 BYOL-Explore: Exploration by Bootstrapped Prediction Zhaohan Guo, Shantanu Thakoor, Miruna Pislar, Bernardo Avila Pires, Florent Altché, Corentin Tallec, Alaa Saade, Daniele Calandriello, Jean-Bastien Grill, Yunhao Tang, Michal Valko, Remi Munos, Mohammad Gheshlaghi Azar, Bilal Piot

NeurIPSW 2022 Curiosity in Hindsight Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Remi Munos, Michal Valko

ICML 2022 From Dirichlet to Rubin: Optimistic Exploration in RL Without Bonuses Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Menard

ICLR 2022 Large-Scale Representation Learning on Graphs via Bootstrapping Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L Dyer, Remi Munos, Petar Veličković, Michal Valko

NeurIPS 2022 Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Ménard

ICML 2022 Retrieval-Augmented Reinforcement Learning Anirudh Goyal, Abram Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, Arthur Guez, Mehdi Mirza, Peter C Humphreys, Ksenia Konyushkova, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell

ICML 2022 Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

AISTATS 2021 A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces Omar Darwiche Domingues, Pierre Menard, Matteo Pirotta, Emilie Kaufmann, Michal Valko

NeurIPS 2021 A Provably Efficient Sample Collection Strategy for Reinforcement Learning Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

ALT 2021 Adaptive Reward-Free Exploration Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko

ICLRW 2021 Bootstrapped Representation Learning on Graphs Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Remi Munos, Petar Veličković, Michal Valko

ICCV 2021 Broaden Your Views for Self-Supervised Video Learning Adrià Recasens, Pauline Luc, Jean-Baptiste Alayrac, Luyu Wang, Florian Strub, Corentin Tallec, Mateusz Malinowski, Viorica Pătrăucean, Florent Altché, Michal Valko, Jean-Bastien Grill, Aäron van den Oord, Andrew Zisserman

ICMLW 2021 Density-Based Bonuses on Learned Representations for Reward-Free Exploration in Deep Reinforcement Learning Omar Darwiche Domingues, Corentin Tallec, Remi Munos, Michal Valko

NeurIPS 2021 Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity Ran Liu, Mehdi Azabou, Max Dabagia, Chi-Heng Lin, Mohammad Gheshlaghi Azar, Keith Hengen, Michal Valko, Eva Dyer

ALT 2021 Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko

ICML 2021 Fast Active Learning for Pure Exploration in Reinforcement Learning Pierre Menard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko

JAIR 2021 Game Plan: What AI Can Do for Football, and What Football Can Do for AI Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adrià Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Pérolat, Bart De Vylder, S. M. Ali Eslami, Mark Rowland, Andrew Jaegle, Rémi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis

ICML 2021 Kernel-Based Reinforcement Learning: A Finite-Time Analysis Omar Darwiche Domingues, Pierre Menard, Matteo Pirotta, Emilie Kaufmann, Michal Valko

NeurIPS 2021 Learning in Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall Tadashi Kozuno, Pierre Ménard, Remi Munos, Michal Valko

ICML 2021 Online A-Optimal Design and Active Linear Regression Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

ICML 2021 Revisiting Peng’s Q($λ$) for Modern Reinforcement Learning Tadashi Kozuno, Yunhao Tang, Mark Rowland, Remi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel

ALT 2021 Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

NeurIPS 2021 Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret Jean Tarbouriech, Runlong Zhou, Simon S Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric

ICML 2021 Taylor Expansion of Discount Factors Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko

ICML 2021 UCB Momentum Q-Learning: Correcting the Bias Without Forgetting Pierre Menard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko

NeurIPS 2021 Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation Yunhao Tang, Tadashi Kozuno, Mark Rowland, Remi Munos, Michal Valko

AISTATS 2020 A Single Algorithm for Both Restless and Rested Rotting Bandits Julien Seznec, Pierre Menard, Alessandro Lazaric, Michal Valko

AISTATS 2020 Adaptive Multi-Fidelity Optimization with Fast Learning Rates Côme Fiegel, Victor Gabillon, Michal Valko

NeurIPS 2020 Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Remi Munos, Michal Valko

ICML 2020 Budgeted Online Influence Maximization Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko

COLT 2020 Covariance-Adapting Algorithm for Semi-Bandits with Application to Sparse Outcomes Pierre Perrault, Michal Valko, Vianney Perchet

AISTATS 2020 Derivative-Free & Order-Robust Optimisation Haitham Ammar, Victor Gabillon, Rasul Tutunov, Michal Valko

AISTATS 2020 Fixed-Confidence Guarantees for Bayesian Best-Arm Identification Xuedong Shang, Rianne de Heide, Pierre Menard, Emilie Kaufmann, Michal Valko

ICML 2020 Gamification of Pure Exploration for Linear Bandits Rémy Degenne, Pierre Menard, Xuedong Shang, Michal Valko

NeurIPS 2020 Improved Sample Complexity for Incremental Autonomous Exploration in MDPs Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

ICML 2020 Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards Aadirupa Saha, Pierre Gaillard, Michal Valko

ICML 2020 Monte-Carlo Tree Search as Regularized Policy Optimization Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Remi Munos

ICML 2020 Near-Linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

ICML 2020 No-Regret Exploration in Goal-Oriented Reinforcement Learning Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric

NeurIPS 2020 Planning in Markov Decision Processes with Gap-Dependent Sample Complexity Anders Jonsson, Emilie Kaufmann, Pierre Menard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko

NeurIPS 2020 Sampling from a K-DPP Without Looking at All Items Daniele Calandriello, Michal Derezinski, Michal Valko

JMLR 2020 Spectral Bandits Tomáš Kocák, Rémi Munos, Branislav Kveton, Shipra Agrawal, Michal Valko

NeurIPS 2020 Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits Pierre Perrault, Etienne Boursier, Michal Valko, Vianney Perchet

ICML 2020 Stochastic Bandits with Arm-Dependent Delays Manegueu Anne Gael, Claire Vernade, Alexandra Carpentier, Michal Valko

ICML 2020 Taylor Expansion Policy Optimization Yunhao Tang, Michal Valko, Remi Munos

ALT 2019 A Simple Parameter-Free and Adaptive Approach to Optimization Under a Minimal Local Smoothness Assumption Peter L. Bartlett, Victor Gabillon, Michal Valko

AISTATS 2019 Active Multiple Matrix Completion with Adaptive Confidence Sets Andrea Locatelli, Alexandra Carpentier, Michal Valko

JMLR 2019 DPPy: DPP Sampling with Python Guillaume Gautier, Guillermo Polito, Rémi Bardenet, Michal Valko

NeurIPS 2019 Exact Sampling of Determinantal Point Processes with Sublinear Time Preprocessing Michal Derezinski, Daniele Calandriello, Michal Valko

ICML 2019 Exploiting Structure of Uncertainty for Efficient Matroid Semi-Bandits Pierre Perrault, Vianney Perchet, Michal Valko

AISTATS 2019 Finding the Bandit in a Graph: Sequential Search-and-Stop Pierre Perrault, Vianney Perchet, Michal Valko

COLT 2019 Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret Daniele Calandriello, Luigi Carratino, Alessandro Lazaric, Michal Valko, Lorenzo Rosasco

ALT 2019 General Parallel Optimization Without a Metric Xuedong Shang, Emilie Kaufmann, Michal Valko

NeurIPS 2019 Multiagent Evaluation Under Incomplete Information Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

NeurIPS 2019 On Two Ways to Use Determinantal Point Processes for Monte Carlo Integration Guillaume Gautier, Rémi Bardenet, Michal Valko

NeurIPS 2019 Planning in Entropy-Regularized Markov Decision Processes and Games Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Menard, Remi Munos, Michal Valko

AISTATS 2019 Rotting Bandits Are No Harder than Stochastic Ones Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko

ICML 2019 Scale-Free Adaptive Planning for Deterministic Dynamics & Discounted Rewards Peter Bartlett, Victor Gabillon, Jennifer Healey, Michal Valko

COLT 2018 Best of Both Worlds: Stochastic & Adversarial Best-Arm Identification Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek, Michal Valko

ECCV 2018 Compressing the Input for CNNs with the First-Order Scattering Transform Edouard Oyallon, Eugene Belilovsky, Sergey Zagoruyko, Michal Valko

ICML 2018 Improved Large-Scale Graph Learning Through Ridge Spectral Sparsification Daniele Calandriello, Alessandro Lazaric, Ioannis Koutis, Michal Valko

NeurIPS 2018 Optimistic Optimization of a Brownian Jean-Bastien Grill, Michal Valko, Remi Munos

AISTATS 2017 Distributed Adaptive Sampling for Kernel Matrix Approximation Daniele Calandriello, Alessandro Lazaric, Michal Valko

NeurIPS 2017 Efficient Second-Order Online Kernel Learning with Adaptive Embedding Daniele Calandriello, Alessandro Lazaric, Michal Valko

NeurIPS 2017 Online Influence Maximization Under Independent Cascade Model with Semi-Bandit Feedback Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani

ICML 2017 Second-Order Kernel Online Convex Optimization with Adaptive Sketching Daniele Calandriello, Alessandro Lazaric, Michal Valko

AISTATS 2017 Trading Off Rewards and Errors in Multi-Armed Bandits Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yun-En Liu

ICML 2017 Zonotope Hit-and-Run for Efficient Sampling from Projection DPPs Guillaume Gautier, Rémi Bardenet, Michal Valko

UAI 2016 Analysis of Nyström Method with Sequential Ridge Leverage Scores Daniele Calandriello, Alessandro Lazaric, Michal Valko

JMLR 2016 Bayesian Policy Gradient and Actor-Critic Algorithms Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko

NeurIPS 2016 Blazing the Trails Before Beating the Path: Sample-Efficient Monte-Carlo Planning Jean-Bastien Grill, Michal Valko, Remi Munos

UAI 2016 Online Learning with Erdős-Rényi Side-Observation Graphs Tomáš Kocák, Gergely Neu, Michal Valko

AISTATS 2016 Online Learning with Noisy Side Observations Tomáš Kocák, Gergely Neu, Michal Valko

ICML 2016 Pliable Rejection Sampling Akram Erraqabi, Michal Valko, Alexandra Carpentier, Odalric Maillard

AISTATS 2016 Revealing Graph Bandits for Maximizing Local Influence Alexandra Carpentier, Michal Valko

NeurIPS 2015 Black-Box Optimization of Noisy Functions with Unknown Smoothness Jean-Bastien Grill, Michal Valko, Remi Munos

ICML 2015 Cheap Bandits Manjesh Hanawal, Venkatesh Saligrama, Michal Valko, Remi Munos

IJCAI 2015 Maximum Entropy Semi-Supervised Inverse Reinforcement Learning Julien Audiffren, Michal Valko, Alessandro Lazaric, Mohammad Ghavamzadeh

ICML 2015 Simple Regret for Infinitely Many Armed Bandits Alexandra Carpentier, Michal Valko

NeurIPS 2014 Efficient Learning by Implicit Exploration in Bandit Problems with Side Observations Tomáš Kocák, Gergely Neu, Michal Valko, Remi Munos

NeurIPS 2014 Extreme Bandits Alexandra Carpentier, Michal Valko

NeurIPS 2014 Online Combinatorial Optimization with Stochastic Decision Sets and Adversarial Losses Gergely Neu, Michal Valko

ICML 2014 Spectral Bandits for Smooth Graph Functions Michal Valko, Remi Munos, Branislav Kveton, Tomáš Kocák

AAAI 2014 Spectral Thompson Sampling Tomáš Kocák, Michal Valko, Rémi Munos, Shipra Agrawal

UAI 2013 Finite-Time Analysis of Kernelised Contextual Bandits Michal Valko, Nathaniel Korda, Rémi Munos, Ilias N. Flaounas, Nello Cristianini

ICML 2013 Stochastic Simultaneous Optimistic Optimization Michal Valko, Alexandra Carpentier, Rémi Munos

UAI 2010 Online Semi-Supervised Learning on Quantized Graphs Michal Valko, Branislav Kveton, Ling Huang, Daniel Ting

CVPRW 2010 Online Semi-Supervised Perception: Real-Time Learning Without Explicit Feedback Branislav Kveton, Matthai Philipose, Michal Valko, Ling Huang

AISTATS 2010 Semi-Supervised Learning with Max-Margin Graph Cuts Branislav Kveton, Michal Valko, Ali Rahimi, Ling Huang