ML Anthology
Bacon, Pierre-Luc
42 publications
ICLR
2025
MaestroMotif: Skill Design from Artificial Intelligence Feedback
Martin Klissarov, Mikael Henaff, Roberta Raileanu, Shagun Sodhani, Pascal Vincent, Amy Zhang, Pierre-Luc Bacon, Doina Precup, Marlos C. Machado, Pierluca D'Oro
TMLR
2025
Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons
Simon Dufort-Labbé, Pierluca D'Oro, Evgenii Nikishin, Irina Rish, Pierre-Luc Bacon, Razvan Pascanu, Aristide Baratin
ICLRW
2025
Mol-MoE: Training Preference-Guided Routers for Molecule Generation
Diego Calanzone, Pierluca D'Oro, Pierre-Luc Bacon
ICML
2025
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning
Guozheng Ma, Lu Li, Zilin Wang, Li Shen, Pierre-Luc Bacon, Dacheng Tao
ICML
2025
Scaling Trends in Language Model Robustness
Nikolaus H. R. Howe, Ian R. McKenzie, Oskar John Hollinsworth, Michał Zając, Tom Tseng, Aaron David Tucker, Pierre-Luc Bacon, Adam Gleave
NeurIPS
2025
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer, Johan Obando-Ceron, Lu Li, Pierre-Luc Bacon, Glen Berseth, Aaron Courville, Pablo Samuel Castro
NeurIPS
2025
State Entropy Regularization for Robust Reinforcement Learning
Yonatan Ashlag, Uri Koren, Mirco Mutti, Esther Derman, Pierre-Luc Bacon, Shie Mannor
ICLR
2024
Bridging State and History Representations: Understanding Self-Predictive RL
Tianwei Ni, Benjamin Eysenbach, Erfan SeyedSalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon
ICLR
2024
Course Correcting Koopman Representations
Mahan Fathi, Clement Gehring, Jonathan Pilault, David Kanaa, Pierre-Luc Bacon, Ross Goroshin
ICLR
2024
Decoupling Regularization from the Action Space
Sobhan Mohammadpour, Emma Frejinger, Pierre-Luc Bacon
ICML
2024
Do Transformer World Models Give Better Policy Gradients?
Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon
ICMLW
2024
Exploring Scaling Trends in LLM Robustness
Nikolaus H. R. Howe, Michał Zając, Ian R. McKenzie, Oskar John Hollinsworth, Pierre-Luc Bacon, Adam Gleave
AISTATS
2024
Maximum Entropy GFlowNets with Soft Q-Learning
Sobhan Mohammadpour, Emmanuel Bengio, Emma Frejinger, Pierre-Luc Bacon
ICLR
2024
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
NeurIPS
2023
Block-State Transformers
Jonathan Pilault, Mahan Fathi, Orhan Firat, Chris Pal, Pierre-Luc Bacon, Ross Goroshin
NeurIPS
2023
Double Gumbel Q-Learning
David Yu-Tung Hui, Aaron C. Courville, Pierre-Luc Bacon
ICMLW
2023
Goal-Conditioned GFlowNets for Controllable Multi-Objective Molecular Design
Julien Roy, Pierre-Luc Bacon, Christopher Pal, Emmanuel Bengio
NeurIPSW
2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
NeurIPSW
2023
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
NeurIPS
2023
Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc Bellemare
ICLR
2023
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville
NeurIPS
2023
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon
NeurIPSW
2023
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon
ICLR
2022
Continuous-Time Meta-Learning with Forward Mode Differentiation
Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon
AAAI
2022
Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation
Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon
ICML
2022
Direct Behavior Specification via Constrained Reinforcement Learning
Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Chris J Pal
NeurIPS
2022
Myriad: A Real-World Testbed to Bridge Trajectory Optimization and Deep Learning
Nikolaus Howe, Simon Dufort-Labbé, Nitarshan Rajkumar, Pierre-Luc Bacon
NeurIPSW
2022
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville
ICML
2022
The Primacy Bias in Deep Reinforcement Learning
Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville
ICMLW
2021
Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation
Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon
NeurIPSW
2021
Long-Term Credit Assignment via Model-Based Temporal Shortcuts
Michel Ma, Pierluca D'Oro, Yoshua Bengio, Pierre-Luc Bacon
NeurIPS
2021
Neural Algorithmic Reasoners Are Implicit Planners
Andreea-Ioana Deac, Petar Veličković, Ognjen Milinkovic, Pierre-Luc Bacon, Jian Tang, Mladen Nikolic
AAAI
2020
Options of Interest: Temporal Abstraction with Interest Functions
Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup
ICML
2020
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Yao Liu, Pierre-Luc Bacon, Emma Brunskill
NeurIPSW
2020
XLVIN: eXecuted Latent Value Iteration Nets
Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, Mladen Nikolić
ICML
2018
Convergent Tree Backup and Retrace with Function Approximation
Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent
AAAI
2018
Learning Robust Options
Daniel J. Mankowitz, Timothy A. Mann, Pierre-Luc Bacon, Doina Precup, Shie Mannor
AAAI
2018
Learning with Options That Terminate Off-Policy
Anna Harutyunyan, Peter Vrancx, Pierre-Luc Bacon, Doina Precup, Ann Nowé
AAAI
2018
OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning
Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup
AAAI
2018
When Waiting Is Not an Option: Learning Options with a Deliberation Cost
Jean Harb, Pierre-Luc Bacon, Martin Klissarov, Doina Precup
AAAI
2017
The Option-Critic Architecture
Pierre-Luc Bacon, Jean Harb, Doina Precup
UAI
2015
Learning and Planning with Timing Information in Markov Decision Processes
Pierre-Luc Bacon, Borja Balle, Doina Precup