You, Haoran
16 publications
ICML
2025
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
NeurIPS
2024
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization