ML Anthology
Baek, David D.
4 publications
ICLR
2026
Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth
Jiawei Zhang, Andrew Estornell, David D. Baek, Bo Li, Xiaojun Xu
TMLR
2025
Harmonic Loss Trains Interpretable AI Models
David D. Baek, Ziming Liu, Riya Tyagi, Max Tegmark
NeurIPS
2025
Scaling Laws for Scalable Oversight
Joshua Engels, David D. Baek, Subhash Kantamneni, Max Tegmark
ICLRW
2025
Towards Understanding Distilled Reasoning Models: A Representational Approach
David D. Baek, Max Tegmark