Fu, Yonggan
30 publications
ICML 2025. LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
ICLR 2025. LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
NeurIPS 2025. Nemotron-CLIMB: Clustering-Based Iterative Data Mixture Bootstrapping for Language Model Pre-Training
NeurIPS 2024. AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
NeurIPS 2021. Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found Within Randomly Initialized Networks