Alman, Josh

7 publications

ICLR 2025 Fundamental Limitations on Subquadratic Alternatives to Transformers Josh Alman, Hantao Yu
NeurIPS 2025 Two Heads Are Better than One: Simulating Large Transformers with Small Ones Hantao Yu, Josh Alman
ICLR 2024 How to Capture Higher-Order Correlations? Generalizing Matrix SoftMax Attention to Kronecker Computation Josh Alman, Zhao Song
NeurIPS 2024 Metric Transforms and Low Rank Representations of Kernels for Fast Attention Timothy Chu, Josh Alman, Gary Miller, Shyam Narayanan, Mark Sellke, Zhao Song
NeurIPS 2024 The Fine-Grained Complexity of Gradient Computation for Training Large Language Models Josh Alman, Zhao Song
NeurIPS 2023 Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo
NeurIPS 2023 Fast Attention Requires Bounded Entries Josh Alman, Zhao Song