Ternary Momentum for Quantized Training
Abstract
Quantization enables efficient inference on resource-limited devices, yet training still depends on high-precision gradients and optimizer states. We address this gap by introducing stochastic ternary momentum, a fully quantized optimizer that operates on quantized parameters, ternary gradient information, and ternary momentum states, enabling stable and memory-efficient quantized optimization. Our method replaces deterministic, full-precision updates with integer-valued updates driven by stochastic sampling, ensuring that the expected updates match those of standard momentum while maintaining strict memory constraints. It eliminates re-quantization overhead and preserves quantization consistency throughout training. We establish theoretical convergence guarantees for our ternary momentum method for convex objectives over bounded integer domains and for non-convex objectives over unbounded integer domains. Experiments on vision and language tasks demonstrate that our approach retains strong performance while reducing optimizer memory by 95% compared to full-precision optimization, advancing the feasibility of fully quantized training.
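The abstract does not spell out the sampling rule, so the following is only a minimal sketch of the general idea of an unbiased stochastic ternary value, not the authors' algorithm. It assumes a scalar input already scaled into [-1, 1]; the function name `stochastic_ternarize` and the sample values are hypothetical. The input's sign is emitted with probability equal to its magnitude and 0 otherwise, so the expected output equals the input.

```python
import random


def stochastic_ternarize(x, rng):
    """Return t in {-1, 0, +1} with E[t] = x, for a scalar x in [-1, 1].

    Illustrative assumption only: the paper's actual update rule may differ.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must lie in [-1, 1]")
    # Emit sign(x) with probability |x|, otherwise emit 0.
    if rng.random() < abs(x):
        return 1 if x > 0 else -1
    return 0


rng = random.Random(0)
values = [-0.8, -0.1, 0.0, 0.3, 1.0]  # hypothetical inputs
n = 20000

# Averaging many ternary samples should approximately recover each input.
empirical_means = [
    sum(stochastic_ternarize(x, rng) for _ in range(n)) / n for x in values
]
```

Each individual sample carries only one of three values, yet the average over many samples approaches the real-valued input, which is the sense in which a stochastic integer-valued update can match a full-precision one in expectation.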
Cite
Text
Bar et al. "Ternary Momentum for Quantized Training." Transactions on Machine Learning Research, 2026.
Markdown
[Bar et al. "Ternary Momentum for Quantized Training." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/bar2026tmlr-ternary/)
BibTeX
@article{bar2026tmlr-ternary,
title = {{Ternary Momentum for Quantized Training}},
author = {Bar, Noga and Attia, Amit and Moshkovitz, Michal and Di Castro, Dotan},
journal = {Transactions on Machine Learning Research},
year = {2026},
url = {https://mlanthology.org/tmlr/2026/bar2026tmlr-ternary/}
}