ForceVLA: Enhancing VLA Models with a Force-Aware MoE for Contact-Rich Manipulation

Abstract

Vision-Language-Action (VLA) models have advanced general-purpose robotic manipulation by leveraging pretrained visual and linguistic representations. However, they struggle with contact-rich tasks that require fine-grained control involving force, especially under visual occlusion or dynamic uncertainty. To address these limitations, we propose ForceVLA, a novel end-to-end manipulation framework that treats external force sensing as a first-class modality within VLA systems. ForceVLA introduces FVLMoE, a force-aware Mixture-of-Experts fusion module that dynamically integrates pretrained visual-language embeddings with real-time 6-axis force feedback during action decoding. This enables context-aware routing across modality-specific experts, enhancing the robot's ability to adapt to subtle contact dynamics. We also introduce ForceVLA-Data, a new dataset comprising synchronized vision, proprioception, and force-torque signals across five contact-rich manipulation tasks. ForceVLA improves average task success by 23.2% over strong π₀-based baselines, achieving up to 80% success in tasks such as plug insertion. Our approach highlights the importance of multimodal integration for dexterous manipulation and sets a new benchmark for physically intelligent robotic control. Code and data will be released at https://sites.google.com/view/forcevla2025/.
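To make the fusion idea concrete, below is a minimal PyTorch sketch of a force-aware Mixture-of-Experts layer in the spirit of FVLMoE, not the authors' implementation: the 6-axis force-torque reading is projected into the token space, appended to the vision-language embeddings, and a learned gate routes each token through a small set of experts. All names, dimensions, expert counts, and the top-k routing scheme are illustrative assumptions.

# Minimal sketch (assumed architecture, not the paper's code) of a
# force-aware Mixture-of-Experts fusion layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ForceAwareMoE(nn.Module):
    def __init__(self, d_model: int = 512, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Project the raw 6-axis force-torque reading into the token space.
        self.force_proj = nn.Sequential(
            nn.Linear(6, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )
        # Modality-specific feed-forward experts (hypothetical count).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        # Gate that scores experts per token, conditioned on the fused input.
        self.gate = nn.Linear(d_model, n_experts)

    def forward(self, vl_tokens: torch.Tensor, force: torch.Tensor) -> torch.Tensor:
        # vl_tokens: (B, T, d_model) pretrained vision-language embeddings
        # force:     (B, 6) latest force-torque reading
        force_tok = self.force_proj(force).unsqueeze(1)   # (B, 1, d_model)
        x = torch.cat([vl_tokens, force_tok], dim=1)      # append force token
        logits = self.gate(x)                             # (B, T+1, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Route each token to its top-k experts and mix their outputs.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e)
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    moe = ForceAwareMoE()
    fused = moe(torch.randn(2, 10, 512), torch.randn(2, 6))
    print(fused.shape)  # torch.Size([2, 11, 512])

The fused output (vision-language tokens plus the force token) would then feed the action decoder; how ForceVLA conditions decoding on this representation is described in the paper itself.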

Cite

Text

Yu et al. "ForceVLA: Enhancing VLA Models with a Force-Aware MoE for Contact-Rich Manipulation." Advances in Neural Information Processing Systems, 2025.

Markdown

[Yu et al. "ForceVLA: Enhancing VLA Models with a Force-Aware MoE for Contact-Rich Manipulation." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/yu2025neurips-forcevla/)

BibTeX

@inproceedings{yu2025neurips-forcevla,
  title     = {{ForceVLA: Enhancing VLA Models with a Force-Aware MoE for Contact-Rich Manipulation}},
  author    = {Yu, Jiawen and Liu, Hairuo and Yu, Qiaojun and Ren, Jieji and Hao, Ce and Ding, Haitong and Huang, Guangyu and Huang, Guofan and Song, Yan and Cai, Panpan and Zhang, Wenqiang and Lu, Cewu},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/yu2025neurips-forcevla/}
}