ML Anthology
Wu, Ben Peng
3 publications
ICLRW 2025
Antipodal Pairing and Mechanistic Signals in Dense SAE Latents
Alessandro Stolfo, Ben Peng Wu, Mrinmaya Sachan
NeurIPS 2025
Dense SAE Latents Are Features, Not Bugs
Xiaoqing Sun, Alessandro Stolfo, Joshua Engels, Ben Peng Wu, Senthooran Rajamanoharan, Mrinmaya Sachan, Max Tegmark
ICMLW 2024
Confidence Regulation Neurons in Language Models
Alessandro Stolfo, Ben Peng Wu, Wes Gurnee, Yonatan Belinkov, Xingyi Song, Mrinmaya Sachan, Neel Nanda