Lee, Andrew

8 publications

ICLR 2025 ICLR: In-Context Learning of Representations Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka
ICLRW 2025 Shared Global and Local Geometry of Language Model Embeddings Andrew Lee, Fernanda Viégas, Martin Wattenberg
ICML 2024 A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity Andrew Lee, Xiaoyan Bai, Itamar Pres, Martin Wattenberg, Jonathan K. Kummerfeld, Rada Mihalcea
NeurIPS 2024 Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space Core Francisco Park, Maya Okawa, Andrew Lee, Hidenori Tanaka, Ekdeep Singh Lubana
IJCAI 2024 Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search Abbas Mehrabian, Ankit Anand, Hyunjik Kim, Nicolas Sonnerat, Matej Balog, Gheorghe Comanici, Tudor Berariu, Andrew Lee, Anian Ruoss, Anna Bulanova, Daniel Toyama, Sam Blackwell, Bernardino Romera-Paredes, Petar Veličković, Laurent Orseau, Joonkyung Lee, Anurag Murty Naredla, Doina Precup, Adam Zsolt Wagner
ICMLW 2024 Hidden Learning Dynamics of Capability Before Behavior in Diffusion Models Core Francisco Park, Maya Okawa, Andrew Lee, Ekdeep Singh Lubana, Hidenori Tanaka
NeurIPSW 2024 Structured In-Context Task Representations Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Kento Nishi, Maya Okawa, Hidenori Tanaka
NeurIPSW 2023 Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search Abbas Mehrabian, Ankit Anand, Hyunjik Kim, Nicolas Sonnerat, Tudor Berariu, Matej Balog, Gheorghe Comanici, Andrew Lee, Anian Ruoss, Anna Bulanova, Daniel Toyama, Sam Blackwell, Bernardino Romera-Paredes, Laurent Orseau, Petar Veličković, Anurag Murty Naredla, Joonkyung Lee, Adam Zsolt Wagner, Doina Precup