Osher, Stanley
19 publications
NeurIPS
2025
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
NeurIPS
2022
Improving Neural Ordinary Differential Equations with Nesterov's Accelerated Gradient Method
NeurIPS
2021
FMMformer: Efficient and Flexible Transformer via Decomposed Near-Field and Far-Field Attention