Biddulph, Caleb

3 publications

ICML 2025. MONA: Myopic Optimization with Non-Myopic Approval Can Mitigate Multi-Step Reward Hacking. Sebastian Farquhar, Vikrant Varma, David Lindner, David Elson, Caleb Biddulph, Ian Goodfellow, Rohin Shah.
UAI 2023. Bandits with Costly Reward Observations. Aaron D. Tucker, Caleb Biddulph, Claire Wang, Thorsten Joachims.
NeurIPSW 2022. Bandits with Costly Reward Observations. Aaron David Tucker, Caleb Biddulph, Claire Wang, Thorsten Joachims.