Mendes, Ethan

2 publications

NeurIPS 2025. Language Models Can Self-Improve at State-Value Estimation for Better Search. Ethan Mendes, Alan Ritter
NeurIPSW 2023. Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game. Sam Toyer, Olivia Watkins, Ethan Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell