ML Anthology
Hosseini, Arian
12 publications
ICLR
2025
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville
ICLR
2025
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
ICLR
2025
Smaller, Weaker, yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
NeurIPSW
2024
Faster, More Efficient RLHF Through Off-Policy Asynchronous Learning
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville
NeurIPSW
2024
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
NeurIPSW
2024
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
NeurIPSW
2024
Not All LLM Reasoners Are Created Equal
Arian Hosseini, Alessandro Sordoni, Daniel Kenji Toyama, Aaron Courville, Rishabh Agarwal
NeurIPSW
2024
Not All LLM Reasoners Are Created Equal
Arian Hosseini, Alessandro Sordoni, Daniel Kenji Toyama, Aaron Courville, Rishabh Agarwal
NeurIPSW
2024
Smaller, Weaker, yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
NeurIPS
2023
Joint Prompt Optimization of Stacked LLMs Using Variational Inference
Alessandro Sordoni, Eric Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux
ICLR
2019
Learning to Understand Goal Specifications by Modelling Reward
Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette
NeurIPS
2019
Ordered Memory
Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron C. Courville