Yang, Chiwun

8 publications

CPAL 2025 Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond Yekun Ke, Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang

NeurIPS 2025 Efficient $k$-Sparse Band-Limited Interpolation with Improved Approximation Ratio Yang Cao, Xiaoyu Li, Zhao Song, Chiwun Yang

ICLRW 2025 How Sparse Attention Approximates Exact Attention? Your Attention Is Naturally $n^C$-Sparse Zhao Song, Jing Xiong, Chiwun Yang

ICML 2025 ParallelComp: Parallel Long-Context Compressor for Length Extrapolation Jing Xiong, Jianghan Shen, Chuanyang Zheng, Zhongwei Wan, Chenyang Zhao, Chiwun Yang, Fanghua Ye, Hongxia Yang, Lingpeng Kong, Ngai Wong

ICLRW 2025 Towards Infinite-Long Prefix in Transformers Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang

CPAL 2025 Unlock the Theory Behind Scaling 1-Bit Neural Networks Majid Daliri, Zhao Song, Chiwun Yang

ICLRW 2025 Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation Yang Cao, Zhao Song, Chiwun Yang

AAAI 2024 How to Protect Copyright Data in Optimization of Large Language Models? Timothy Chu, Zhao Song, Chiwun Yang