ML Anthology
Pitis, Silviu
25 publications
NeurIPS
2025
Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models
Christopher Chiu, Silviu Pitis, Mihaela van der Schaar
ICLR
2024
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
NeurIPS
2024
Improving Context-Aware Preference Modeling for Language Models
Silviu Pitis, Ziang Xiao, Nicolas Le Roux, Alessandro Sordoni
NeurIPSW
2024
Report Cards: Qualitative Evaluation of LLMs Using Natural Language Summaries
Blair Yang, Fuyang Cui, Keiran Paster, Jimmy Ba, Pashootan Vaezipoor, Silviu Pitis, Michael R. Zhang
ICMLW
2023
Calibrating Language Models via Augmented Prompt Ensembles
Mingjian Jiang, Yangjun Ruan, Sicong Huang, Saifei Liao, Silviu Pitis, Roger Baker Grosse, Jimmy Ba
NeurIPS
2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
ICMLW
2023
Failure Modes of Learning Reward Models for LLMs and Other Sequence Models
Silviu Pitis
NeurIPSW
2023
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
ICLR
2023
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
ICMLW
2023
Multi-Objective Agency Requires Non-Markovian Rewards
Silviu Pitis
NeurIPSW
2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPS
2022
MoCoDA: Model-Based Counterfactual Data Augmentation
Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg
ICMLW
2022
MoCoDA: Model-Based Counterfactual Data Augmentation
Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg
NeurIPSW
2022
Rational Multi-Objective Agents Must Admit Non-Markov Reward Representations
Silviu Pitis, Duncan Bailey, Jimmy Ba
NeurIPSW
2022
Return Augmentation Gives Supervised RL Temporal Compositionality
Keiran Paster, Silviu Pitis, Sheila A. McIlraith, Jimmy Ba
NeurIPSW
2022
Return Augmentation Gives Supervised RL Temporal Compositionality
Keiran Paster, Silviu Pitis, Sheila A. McIlraith, Jimmy Ba
NeurIPSW
2022
Steering Large Language Models Using APE
Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPSW
2022
Temporary Goals for Exploration
Haoyang Xu, Jimmy Ba, Silviu Pitis, Harris Chan
NeurIPSW
2022
Temporary Goals for Exploration
Haoyang Xu, Jimmy Ba, Silviu Pitis, Harris Chan
ICLR
2020
An Inductive Bias for Distances: Neural Nets That Respect the Triangle Inequality
Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
NeurIPS
2020
Counterfactual Data Augmentation Using Locally Factored Dynamics
Silviu Pitis, Elliot Creager, Animesh Garg
AAAI
2020
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
Kristopher De Asis, Alan Chan, Silviu Pitis, Richard S. Sutton, Daniel Graves
ICML
2020
Maximum Entropy Gain Exploration for Long Horizon Multi-Goal Reinforcement Learning
Silviu Pitis, Harris Chan, Stephen Zhao, Bradly Stadie, Jimmy Ba
AAAI
2019
Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach
Silviu Pitis
AAAI
2018
Source Traces for Temporal Difference Learning
Silviu Pitis