ML Anthology
Patel, Oam
3 publications
TMLR
2025
Defending Against Unforeseen Failure Modes with Latent Adversarial Training
Stephen Casper, Lennart Schulze, Oam Patel, Dylan Hadfield-Menell
ICLRW
2024
Preventing Memorized Completions Through White-Box Filtering
Oam Patel, Rowan Wang
NeurIPS
2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg