Song, Yujin

6 publications

NeurIPS 2025 | From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers | Ryotaro Kawata, Yujin Song, Alberto Bietti, Naoki Nishikawa, Taiji Suzuki, Samuel Vaiter, Denny Wu
NeurIPS 2025 | How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime? | Wei Huang, Andi Han, Yujin Song, Yilan Chen, Denny Wu, Difan Zou, Taiji Suzuki
ICML 2025 | Nonlinear Transformers Can Perform Inference-Time Feature Learning | Naoki Nishikawa, Yujin Song, Kazusato Oko, Denny Wu, Taiji Suzuki
COLT 2024 | Learning Sum of Diverse Features: Computational Hardness and Efficient Gradient-Based Training for Ridge Combinations | Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu
NeurIPS 2024 | Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context | Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu
ICMLW 2024 | Transformer Efficiently Learns Low-Dimensional Target Functions In-Context | Yujin Song, Denny Wu, Kazusato Oko, Taiji Suzuki