Pitis, Silviu

23 publications

NeurIPS 2025 · Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models · Christopher Chiu, Silviu Pitis, Mihaela van der Schaar
ICLR 2024 · Identifying the Risks of LM Agents with an LM-Emulated Sandbox · Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
NeurIPS 2024 · Improving Context-Aware Preference Modeling for Language Models · Silviu Pitis, Ziang Xiao, Nicolas Le Roux, Alessandro Sordoni
NeurIPSW 2024 · Report Cards: Qualitative Evaluation of LLMs Using Natural Language Summaries · Blair Yang, Fuyang Cui, Keiran Paster, Jimmy Ba, Pashootan Vaezipoor, Silviu Pitis, Michael R. Zhang
ICMLW 2023 · Calibrating Language Models via Augmented Prompt Ensembles · Mingjian Jiang, Yangjun Ruan, Sicong Huang, Saifei Liao, Silviu Pitis, Roger Baker Grosse, Jimmy Ba
NeurIPS 2023 · Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards · Silviu Pitis
ICMLW 2023 · Failure Modes of Learning Reward Models for LLMs and Other Sequence Models · Silviu Pitis
NeurIPSW 2023 · Identifying the Risks of LM Agents with an LM-Emulated Sandbox · Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
ICLR 2023 · Large Language Models Are Human-Level Prompt Engineers · Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
ICMLW 2023 · Multi-Objective Agency Requires Non-Markovian Rewards · Silviu Pitis
NeurIPSW 2022 · Large Language Models Are Human-Level Prompt Engineers · Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPS 2022 · MoCoDA: Model-Based Counterfactual Data Augmentation · Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg
ICMLW 2022 · MoCoDA: Model-Based Counterfactual Data Augmentation · Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg
NeurIPSW 2022 · Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations · Silviu Pitis, Duncan Bailey, Jimmy Ba
NeurIPSW 2022 · Return Augmentation Gives Supervised RL Temporal Compositionality · Keiran Paster, Silviu Pitis, Sheila A. McIlraith, Jimmy Ba
NeurIPSW 2022 · Steering Large Language Models Using APE · Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPSW 2022 · Temporary Goals for Exploration · Haoyang Xu, Jimmy Ba, Silviu Pitis, Harris Chan
ICLR 2020 · An Inductive Bias for Distances: Neural Nets That Respect the Triangle Inequality · Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
NeurIPS 2020 · Counterfactual Data Augmentation Using Locally Factored Dynamics · Silviu Pitis, Elliot Creager, Animesh Garg
AAAI 2020 · Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning · Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves
ICML 2020 · Maximum Entropy Gain Exploration for Long Horizon Multi-Goal Reinforcement Learning · Silviu Pitis, Harris Chan, Stephen Zhao, Bradly Stadie, Jimmy Ba
AAAI 2019 · Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach · Silviu Pitis
AAAI 2018 · Source Traces for Temporal Difference Learning · Silviu Pitis