GRS: Generating Robotic Simulation Tasks from Real-World Images
Abstract
We introduce GRS (Generating Robotic Simulation tasks), a system that addresses the real-to-sim problem for robotic simulation. GRS creates digital twin simulations from single RGB-D observations, paired with solvable tasks for virtual agent training. Using vision-language models (VLMs), our pipeline operates in three stages: 1) scene comprehension with SAM2 for segmentation and object description, 2) matching objects with simulation-ready assets, and 3) generating appropriate tasks. We ensure simulation-task alignment through generated test suites and introduce a router that iteratively refines both simulation and test code. Experiments demonstrate our system's effectiveness in object correspondence and task environment generation through our novel router mechanism.
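The three-stage pipeline and the router's refine-until-tests-pass loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: every function, asset path, and data structure here is a hypothetical stand-in (e.g., the segmentation and LLM-driven code repair steps are mocked with fixed values).

```python
# Hypothetical sketch of a GRS-style pipeline; all names are illustrative
# assumptions, not the paper's API.

def comprehend_scene(rgbd_image):
    # Stage 1: segment the observation (the paper uses SAM2) and describe
    # the objects. A fixed object list stands in for the real output.
    return ["mug", "table"]

def match_assets(objects):
    # Stage 2: map each described object to a simulation-ready asset.
    catalog = {"mug": "assets/mug.usd", "table": "assets/table.usd"}
    return {obj: catalog[obj] for obj in objects if obj in catalog}

def generate_task(assets):
    # Stage 3: propose a task together with a test suite that checks
    # the generated simulation actually supports the task.
    sim_code = {"assets": assets, "goal": "place mug on table", "valid": False}
    tests = [lambda sim: sim["valid"]]
    return sim_code, tests

def route_and_refine(sim_code, tests, max_iters=3):
    # Router: rerun the test suite and patch the simulation code until
    # all tests pass or the iteration budget is exhausted.
    for _ in range(max_iters):
        if all(test(sim_code) for test in tests):
            return sim_code, True
        sim_code["valid"] = True  # stand-in for an LLM-driven code fix
    return sim_code, all(test(sim_code) for test in tests)

objects = comprehend_scene(rgbd_image=None)
assets = match_assets(objects)
sim_code, tests = generate_task(assets)
sim_code, solved = route_and_refine(sim_code, tests)
print(solved)
```

The key design point the sketch tries to capture is that the test suite, not the generator, is the arbiter of simulation-task alignment: the router keeps editing until the tests pass.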
Cite
Text
Zook et al. "GRS: Generating Robotic Simulation Tasks from Real-World Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown
[Zook et al. "GRS: Generating Robotic Simulation Tasks from Real-World Images." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/zook2025cvprw-grs/)

BibTeX
@inproceedings{zook2025cvprw-grs,
  title = {{GRS: Generating Robotic Simulation Tasks from Real-World Images}},
  author = {Zook, Alex and Sun, Fan-Yun and Spjut, Josef B. and Blukis, Valts and Birchfield, Stan and Tremblay, Jonathan},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year = {2025},
  pages = {594--603},
  url = {https://mlanthology.org/cvprw/2025/zook2025cvprw-grs/}
}