Scaling and Distilling Transformer Models for sEMG
Abstract
Surface electromyography (sEMG) signals offer a promising avenue for developing innovative human-computer interfaces by providing insights into muscular activity. However, limited available training data and computational constraints during deployment have restricted the use of state-of-the-art machine learning models, such as transformers, in challenging sEMG tasks. In this paper, we demonstrate that transformer models can learn effective and generalizable representations from sEMG datasets that are small by modern deep learning standards (approximately 100 users), surpassing the performance of classical machine learning methods and older neural network architectures. Additionally, by leveraging model distillation techniques, we reduce parameter counts by up to 50x with minimal loss of performance. This results in efficient and expressive models suitable for complex real-time sEMG tasks in dynamic real-world environments.
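The abstract does not spell out the distillation recipe, so the following is only a minimal sketch of the general technique it names (soft-label knowledge distillation in the style of Hinton et al., 2015), written in PyTorch. All names here (`distillation_loss`, `train_step`, `teacher`, `student`) are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Soft-label distillation loss: KL between temperature-softened
    teacher and student distributions, blended with the usual
    cross-entropy against ground-truth labels."""
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 rescaling keeps gradient magnitudes comparable
    hard_loss = F.cross_entropy(student_logits, targets)
    return alpha * soft_loss + (1 - alpha) * hard_loss

def train_step(student, teacher, batch, optimizer):
    """One hypothetical training step: a large transformer teacher
    supervises a much smaller student on sEMG windows."""
    x, y = batch  # x: (batch, channels, time) sEMG window; y: class labels
    with torch.no_grad():
        teacher_logits = teacher(x)  # teacher is frozen during distillation
    student_logits = student(x)
    loss = distillation_loss(student_logits, teacher_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this kind of setup, the student's capacity (and hence parameter count) can be shrunk aggressively while the soft teacher targets preserve most of the task performance, which is consistent with the up-to-50x reduction the paper reports.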
Cite
Text
Mehlman et al. "Scaling and Distilling Transformer Models for sEMG." Transactions on Machine Learning Research, 2025.
Markdown
[Mehlman et al. "Scaling and Distilling Transformer Models for sEMG." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/mehlman2025tmlr-scaling/)
BibTeX
@article{mehlman2025tmlr-scaling,
  title = {{Scaling and Distilling Transformer Models for sEMG}},
  author = {Mehlman, Nick and Gagnon-Audet, Jean-Christophe and Shvartsman, Michael and Niu, Kelvin and Miller, Alexander H and Sodhani, Shagun},
  journal = {Transactions on Machine Learning Research},
  year = {2025},
  url = {https://mlanthology.org/tmlr/2025/mehlman2025tmlr-scaling/}
}