Beyond Low-Rank Decomposition: A Shortcut Approach for Efficient On-Device Learning

Abstract

On-device learning has emerged as a promising direction for AI development, particularly because of its potential to reduce latency, mitigate the privacy risks associated with device-server communication, and improve energy efficiency. Despite these advantages, tight memory and compute budgets remain major obstacles to deployment. Drawing on prior work on low-rank decomposition methods that address the activation-memory bottleneck of backpropagation, we propose a novel shortcut approach as an alternative. Our analysis and experiments demonstrate that our method reduces activation memory usage by up to $120.09\times$ compared to vanilla training, while also reducing overall training FLOPs by up to $1.86\times$ when evaluated on traditional benchmarks.
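To make the activation-memory bottleneck concrete, here is a minimal NumPy sketch of the general low-rank-decomposition idea the paper builds on (not the paper's shortcut method, and all sizes are hypothetical): for a linear layer, the weight gradient needs the cached input activation, so caching a rank-$r$ factorization of it instead of the full tensor trades exactness for memory.

```python
import numpy as np

# Illustrative sketch (NOT the paper's method): store a low-rank
# factorization of the input activation instead of the activation
# itself, so the backward pass needs far less cached memory.
rng = np.random.default_rng(0)
batch, d_in, d_out, rank = 256, 512, 512, 8  # hypothetical sizes

x = rng.standard_normal((batch, d_in)).astype(np.float32)
grad_y = rng.standard_normal((batch, d_out)).astype(np.float32)

# Forward: instead of caching x, cache its top-r SVD factors.
U, s, Vt = np.linalg.svd(x, full_matrices=False)
U_r, s_r, Vt_r = U[:, :rank], s[:rank], Vt[:rank]

# Backward: rebuild an approximation of x only to compute dW = x^T dy.
x_approx = (U_r * s_r) @ Vt_r
dW_approx = x_approx.T @ grad_y
dW_exact = x.T @ grad_y  # reference computed from the full activation

full_bytes = x.nbytes
lowrank_bytes = U_r.nbytes + s_r.nbytes + Vt_r.nbytes
print(f"cached activation memory: {full_bytes} B (full) vs "
      f"{lowrank_bytes} B (rank-{rank}), "
      f"{full_bytes / lowrank_bytes:.1f}x smaller")
```

The saving grows as the kept rank shrinks relative to the activation dimensions, which is why such schemes can reach large memory-reduction factors; the paper's shortcut approach is positioned as an alternative that avoids the decomposition step itself.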

Cite

Text

Nguyen et al. "Beyond Low-Rank Decomposition: A Shortcut Approach for Efficient On-Device Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Nguyen et al. "Beyond Low-Rank Decomposition: A Shortcut Approach for Efficient On-Device Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/nguyen2025icml-beyond-a/)

BibTeX

@inproceedings{nguyen2025icml-beyond-a,
  title     = {{Beyond Low-Rank Decomposition: A Shortcut Approach for Efficient On-Device Learning}},
  author    = {Nguyen, Le-Trung and Quélennec, Aël and Nguyen, Van-Tam and Tartaglione, Enzo},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {46196--46210},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/nguyen2025icml-beyond-a/}
}