Gu, Albert

37 publications

TMLR 2025 Chimera: State Space Models Beyond Sequences Aakash Lahoti, Tanya Marwah, Ratish Puduppully, Albert Gu
ICLRW 2025 Chimera: State Space Models Beyond Sequences Aakash Lahoti, Tanya Marwah, Ratish Puduppully, Albert Gu
ICLRW 2025 HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model Mingqian Ma, Guoqing Liu, Chuan Cao, Pan Deng, Tri Dao, Albert Gu, Peiran Jin, Zhao Yang, Yingce Xia, Renqian Luo, Pipi Hu, Zun Wang, Yuan-Jyue Chen, Haiguang Liu, Tao Qin
ICLRW 2025 Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing Aviv Bick, Tobias Katsch, Nimit Sharad Sohoni, Arjun D Desai, Albert Gu
ICLR 2025 On the Benefits of Memory for Modeling Time-Dependent PDEs Ricardo Buitrago, Tanya Marwah, Albert Gu, Andrej Risteski
ICLRW 2025 Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners Daniele Paliotta, Junxiong Wang, Matteo Pagliardini, Kevin Li, Aviv Bick, Albert Gu, François Fleuret, Tri Dao
ICML 2025 Understanding and Improving Length Generalization in Recurrent Models Ricardo Buitrago, Albert Gu
ICML 2025 Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism Aviv Bick, Eric Xing, Albert Gu
ICML 2024 Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling Yair Schiff, Chia Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
ICMLW 2024 Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling Yair Schiff, Chia Hsiang Kao, Aaron Gokaslan, Tri Dao, Albert Gu, Volodymyr Kuleshov
NeurIPS 2024 Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers Sukjun Hwang, Aakash Lahoti, Ratish Puduppully, Tri Dao, Albert Gu
ICML 2024 Transformers Are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
NeurIPS 2024 Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
ICLR 2023 How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections Albert Gu, Isys Johnson, Aman Timalsina, Atri Rudra, Christopher Ré
ICLR 2023 Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN David M Knigge, David W. Romero, Albert Gu, Efstratios Gavves, Erik J Bekkers, Jakub Mikolaj Tomczak, Mark Hoogendoorn, Jan-Jakob Sonke
ICML 2023 Resurrecting Recurrent Neural Networks for Long Sequences Antonio Orvieto, Samuel L Smith, Albert Gu, Anushan Fernando, Caglar Gulcehre, Razvan Pascanu, Soham De
NeurIPS 2023 Structured State Space Models for In-Context Reinforcement Learning Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob Foerster, Satinder P. Singh, Feryal Behbahani
ICMLW 2023 Structured State Space Models for In-Context Reinforcement Learning Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob Foerster, Satinder P. Singh, Feryal Behbahani
NeurIPS 2022 Diagonal State Spaces Are as Effective as Structured State Spaces Ankit Gupta, Albert Gu, Jonathan Berant
ICLR 2022 Efficiently Modeling Long Sequences with Structured State Spaces Albert Gu, Karan Goel, Christopher Ré
ICML 2022 It’s Raw! Audio Generation with State-Space Models Karan Goel, Albert Gu, Chris Donahue, Christopher Ré
NeurIPS 2022 On the Parameterization and Initialization of Diagonal State Space Models Albert Gu, Karan Goel, Ankit Gupta, Christopher Ré
NeurIPS 2022 S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces Eric Nguyen, Karan Goel, Albert Gu, Gordon Downs, Preey Shah, Tri Dao, Stephen Baccus, Christopher Ré
ICML 2021 Catformer: Designing Stable Transformers via Sensitivity Analysis Jared Q Davis, Albert Gu, Krzysztof Choromanski, Tri Dao, Christopher Ré, Chelsea Finn, Percy Liang
NeurIPS 2021 Combining Recurrent, Convolutional, and Continuous-Time Models with Linear State Space Layers Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, Christopher Ré
ICML 2021 HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections Ines Chami, Albert Gu, Dat P Nguyen, Christopher Ré
ICLR 2021 Model Patching: Closing the Subgroup Performance Gap with Data Augmentation Karan Goel, Albert Gu, Yixuan Li, Christopher Ré
NeurIPS 2020 From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering Ines Chami, Albert Gu, Vaggos Chatziafratis, Christopher Ré
NeurIPS 2020 HiPPO: Recurrent Memory with Optimal Polynomial Projections Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré
ICML 2020 Improving the Gating Mechanism of Recurrent Neural Networks Albert Gu, Caglar Gulcehre, Thomas Paine, Matt Hoffman, Razvan Pascanu
ICLR 2020 Kaleidoscope: An Efficient, Learnable Representation for All Structured Linear Maps Tri Dao, Nimit Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré
NeurIPS 2020 No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems Nimit Sohoni, Jared Dunnmon, Geoffrey Angus, Albert Gu, Christopher Ré
ICML 2019 A Kernel Theory of Modern Data Augmentation Tri Dao, Albert Gu, Alexander Ratner, Virginia Smith, Chris De Sa, Christopher Ré
ICML 2019 Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations Tri Dao, Albert Gu, Matthew Eichhorn, Atri Rudra, Christopher Ré
ICLR 2019 Learning Mixed-Curvature Representations in Product Spaces Albert Gu, Frederic Sala, Beliz Gunel, Christopher Ré
NeurIPS 2018 Learning Compressed Transforms with Low Displacement Rank Anna Thomas, Albert Gu, Tri Dao, Atri Rudra, Christopher Ré
ICML 2018 Representation Tradeoffs for Hyperbolic Embeddings Frederic Sala, Chris De Sa, Albert Gu, Christopher Ré