Li, Kenneth

5 publications

ICML 2025. Communicating Activations Between Language Model Agents. Vignav Ramesh, Kenneth Li
ICML 2025. When Bad Data Leads to Good Models. Kenneth Li, Yida Chen, Fernanda Viégas, Martin Wattenberg
ICML 2024. Q-Probe: A Lightweight Approach to Reward Maximization for Language Models. Kenneth Li, Samy Jelassi, Hugh Zhang, Sham M. Kakade, Martin Wattenberg, David Brandfonbrener
ICLR 2023. Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task. Kenneth Li, Aspen K Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
NeurIPS 2023. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model. Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg