Susskind, Joshua M.
50 publications
NeurIPS
2025
Flexible Language Modeling in Continuous Space with Transformer-Based Autoregressive Flows
ICML
2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
ICLRW
2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
ICMLW
2024
Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
ICML
2023
NerfDiff: Single-Image View Synthesis with NeRF-Guided Distillation from 3D-Aware Diffusion