Dong, Harry

6 publications

ICLR 2026. Generalized Parallel Scaling with Interdependent Generations. Harry Dong, David Brandfonbrener, Eryk Helenowski, Yun He, Mrinal Kumar, Han Fang, Yuejie Chi, Karthik Abinav Sankararaman
ICLR 2026. Stem: Scaling Transformers with Embedding Modules. Ranajoy Sadhukhan, Sheng Cao, Harry Dong, Changsheng Zhao, Attiano Purpura-Pontoniere, Yuandong Tian, Zechun Liu, Beidi Chen
ICML 2025. ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference. Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen
ICML 2024. Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference. Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen
ICMLW 2024. Prompt-Prompted Adaptive Structured Pruning for Efficient LLM Generation. Harry Dong, Beidi Chen, Yuejie Chi
ICMLW 2023. Towards Structured Sparsity in Transformers for Efficient Inference. Harry Dong, Beidi Chen, Yuejie Chi