Ferret, Johan

12 publications

ICLR 2025. BOND: Aligning LLMs with Best-of-N Distillation. Pier Giuseppe Sessa, Robert Dadashi-Tazehozi, Leonard Hussenot, Johan Ferret, Nino Vieillard, Alexandre Rame, Bobak Shahriari, Sarah Perrin, Abram L. Friesen, Geoffrey Cideron, Sertan Girgin, Piotr Stanczyk, Andrea Michi, Danila Sinopalnikov, Sabela Ramos Garea, Amélie Héliou, Aliaksei Severyn, Matthew Hoffman, Nikola Momchev, Olivier Bachem
ICLR 2025. Diversity-Rewarded CFG Distillation. Geoffrey Cideron, Andrea Agostinelli, Johan Ferret, Sertan Girgin, Romuald Elie, Olivier Bachem, Sarah Perrin, Alexandre Rame
ICML 2025. On Teacher Hacking in Language Model Distillation. Daniil Tiapkin, Daniele Calandriello, Johan Ferret, Sarah Perrin, Nino Vieillard, Alexandre Rame, Mathieu Blondel
TMLR 2024. A Survey of Temporal Credit Assignment in Deep Reinforcement Learning. Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Thomas Mesnard, Hado van Hasselt, Laura Toni
ICMLW 2024. Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL. Eduardo Pignatelli, Johan Ferret, Davide Paglieri, Samuel Coward, Tim Rocktäschel, Edward Grefenstette, Laura Toni
NeurIPSW 2024. Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning. Kaiwen Wang, Rahul Kidambi, Ryan Sullivan, Alekh Agarwal, Christoph Dann, Andrea Michi, Marco Gelmi, Yunxuan Li, Raghav Gupta, Kumar Avinava Dubey, Alexandre Rame, Johan Ferret, Geoffrey Cideron, Le Hou, Hongkun Yu, Amr Ahmed, Aranyak Mehta, Leonard Hussenot, Olivier Bachem, Edouard Leurent
ICML 2024. RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback. Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Ren Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash
ICML 2024. WARM: On the Benefits of Weight Averaged Reward Models. Alexandre Rame, Nino Vieillard, Leonard Hussenot, Robert Dadashi-Tazehozi, Geoffrey Cideron, Olivier Bachem, Johan Ferret
NeurIPSW 2022. Better State Exploration Using Action Sequence Equivalence. Nathan Grinsztajn, Toby Johnstone, Johan Ferret, Philippe Preux
ICLR 2021. Adversarially Guided Actor-Critic. Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist
NeurIPS 2021. There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist
IJCAI 2020. Self-Attentional Credit Assignment for Transfer in Reinforcement Learning. Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin