Lee, Heejun

6 publications

ICLR 2025. A Training-Free Sub-Quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention. Heejun Lee, Geon Park, Youngwan Lee, Jaduk Suh, Jina Kim, Wonyong Jeong, Bumsik Kim, Hyemin Lee, Myeongjae Jeon, Sung Ju Hwang
NeurIPS 2025. Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction. Jeffrey Willette, Heejun Lee, Sung Ju Hwang
CVPRW 2025. Quantized Image Super-Resolution on Mobile NPUs, Mobile AI 2025 Challenge: Report. Andrey Ignatov, Georgy Perevozchikov, Radu Timofte, Zhiyu Zhang, Tianxiao Gao, Yukun Yang, Shiai Zhu, Shihao Wang, Kihwan Yoon, Ganzorig Gankhuyag, Hyeon-Cheol Moon, Taehyun Jeong, Yumi Kim, Suhyeon Lee, Jaehun Baek, Jinwoo Jeong, Eunjun Park, Jun Lee, Heejun Lee, Sungjei Kim, Dafeng Zhang, Yong Yang, Heo Myeong Cheol, Yonghyun Park, Jooho Jeong, Wontae Kim, Kanghwan Lee, Diankai Zhang, Biao Wu, Chengjian Zheng, Shaoli Liu, Si Gao, Ning Wang, Mingshen Wang, Zhao Zhang, Suiyi Zhao, Jinhan Guan, Bo Wang, Yan Luo
ICLR 2025. Training Free Exponential Context Extension via Cascading KV Cache. Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang
ICLR 2024. SEA: Sparse Linear Attention with Estimated Attention Mask. Heejun Lee, Jina Kim, Jeffrey Willette, Sung Ju Hwang
ICLR 2023. Sparse Token Transformer with Attention Back Tracking. Heejun Lee, Minki Kang, Youngwan Lee, Sung Ju Hwang