Cho, Hanseul
10 publications
ICMLW
2024
DASH: Warm-Starting Neural Network Training Without Loss of Plasticity Under Stationarity
NeurIPS
2024
DASH: Warm-Starting Neural Network Training in Stationary Settings Without Loss of Plasticity
NeurIPS
2024
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
ICMLW
2024
Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers