Setlur, Amrith
30 publications
ICMLW
2024
Learning to Reason by Failing: Offline RL on Sub-Optimal Rollouts Scales Synthetic Data by 8x
NeurIPS
2024
On the Benefits of Public Representations for Private Transfer Learning Under Distribution Shift
NeurIPS
2024
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold