ControlManip: Few-Shot Manipulation Fine-Tuning via Object-Centric Conditional Control
Abstract
Learning real-world robotic manipulation is challenging, particularly when only limited demonstrations are available. Existing methods for few-shot manipulation often rely on simulation-augmented data or pre-built modules such as grasping and pose estimation, which struggle with sim-to-real gaps and lack versatility. While large-scale imitation pre-training shows promise, adapting these general-purpose policies to specific tasks in data-scarce settings remains unexplored. To address this gap, we propose ControlManip, a novel framework that bridges pre-trained manipulation policies with object-centric representations via a ControlNet-style architecture for efficient fine-tuning. Specifically, to introduce object-centric conditions without overwriting prior knowledge, ControlManip zero-initializes a set of projection layers, allowing them to gradually adapt the pre-trained manipulation policies. In real-world experiments across 6 diverse tasks, including pouring cubes and folding clothes, our method achieves a 73.3% success rate while requiring only 10-20 demonstrations, a significant improvement over traditional approaches that require more than 100 demonstrations to achieve comparable success. Comprehensive studies show that ControlManip improves the few-shot fine-tuning success rate by 252% over baselines and demonstrates robustness to object and background changes. By lowering the barriers to task development, ControlManip accelerates real-world robot adoption and lays the groundwork for unifying large-scale policy pre-training with object-centric representations.
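The zero-initialization idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `ZeroInitProjection`, `frozen_policy_features`, and `conditioned_features`, along with the toy dimensions and weights, are hypothetical and chosen only for illustration. The sketch assumes the object-centric condition is passed through a projection whose weights start at zero and is added to the features of a frozen pre-trained policy, so the combined model initially reproduces the pre-trained behavior exactly.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class ZeroInitProjection:
    """Linear projection whose weights and bias start at exactly zero."""

    def __init__(self, in_dim, out_dim):
        self.W = [[0.0] * in_dim for _ in range(out_dim)]
        self.b = [0.0] * out_dim

    def __call__(self, x):
        return [y + b for y, b in zip(matvec(self.W, x), self.b)]


def frozen_policy_features(obs):
    """Hypothetical stand-in for one frozen layer of a pre-trained policy."""
    W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]
    return matvec(W, obs)


def conditioned_features(obs, obj_cond, proj):
    """Add the projected object-centric condition to the frozen features."""
    base = frozen_policy_features(obs)
    delta = proj(obj_cond)
    return [b + d for b, d in zip(base, delta)]


obs = [1.0, 2.0, 3.0]             # toy observation
obj_cond = [0.7, -1.3, 0.2, 0.9]  # toy object-centric condition
proj = ZeroInitProjection(in_dim=4, out_dim=2)

# At initialization the projection contributes nothing, so the
# pre-trained features pass through unchanged.
at_init = conditioned_features(obs, obj_cond, proj)
assert at_init == frozen_policy_features(obs)

# Simulate a fine-tuning update on one weight (value chosen arbitrarily):
# the condition now starts to influence the output.
proj.W[0][0] = 0.1
after_update = conditioned_features(obs, obj_cond, proj)
assert after_update != at_init
```

Starting from zero means fine-tuning begins at the pre-trained policy's behavior and departs from it only as the projection weights are updated, which is one plausible reading of how prior knowledge is preserved with few demonstrations.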
Cite
Text
Li et al. "ControlManip: Few-Shot Manipulation Fine-Tuning via Object-Centric Conditional Control." ICLR 2025 Workshops: WRL, 2025.
Markdown
[Li et al. "ControlManip: Few-Shot Manipulation Fine-Tuning via Object-Centric Conditional Control." ICLR 2025 Workshops: WRL, 2025.](https://mlanthology.org/iclrw/2025/li2025iclrw-controlmanip/)
BibTeX
@inproceedings{li2025iclrw-controlmanip,
title = {{ControlManip: Few-Shot Manipulation Fine-Tuning via Object-Centric Conditional Control}},
author = {Li, Puhao and Wu, Yingying and Li, Wanlin and Huang, Yuzhe and Zhang, Zhiyuan and Chen, Yinghan and Zhu, Song-Chun and Liu, Tengyu and Huang, Siyuan},
booktitle = {ICLR 2025 Workshops: WRL},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/li2025iclrw-controlmanip/}
}