Deiseroth, Björn

12 publications

ICLR 2025 Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models Pit Neitemeier, Björn Deiseroth, Constantin Eichenberg, Lukas Balles
NeurIPS 2025 Measuring and Guiding Monosemanticity Ruben Härle, Felix Friedrich, Manuel Brack, Björn Deiseroth, Stephan Waeldchen, Patrick Schramowski, Kristian Kersting
ICLR 2025 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICML 2024 Mechanistic Design and Scaling of Hybrid Architectures Michael Poli, Armin W Thomas, Eric Nguyen, Pragaash Ponnusamy, Björn Deiseroth, Kristian Kersting, Taiji Suzuki, Brian Hie, Stefano Ermon, Christopher Ré, Ce Zhang, Stefano Massaroli
NeurIPSW 2024 SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs Ruben Härle, Felix Friedrich, Manuel Brack, Björn Deiseroth, Patrick Schramowski, Kristian Kersting
NeurIPSW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
NeurIPS 2023 ATMAN: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation Björn Deiseroth, Mayukh Deb, Samuel Weinbach, Manuel Brack, Patrick Schramowski, Kristian Kersting
ICML 2023 ILLUME: Rationalizing Vision-Language Models Through Human Interactions Manuel Brack, Patrick Schramowski, Björn Deiseroth, Kristian Kersting
NeurIPS 2023 MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, Constantin Eichenberg, Andrew M Dai, Robert Baldock, Souradeep Nanda, Koen Oostermeijer, Andres Felipe Cruz-Salinas, Patrick Schramowski, Kristian Kersting, Samuel Weinbach
CVPR 2023 Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models Patrick Schramowski, Manuel Brack, Björn Deiseroth, Kristian Kersting