Steerable Scene Generation with Post Training and Inference-Time Search

Abstract

Training robots in simulation requires diverse 3D scenes that reflect the specific challenges of downstream tasks. However, scenes that satisfy strict task requirements, such as high-clutter environments with plausible spatial arrangement, are rare and costly to curate manually. Instead, we generate large-scale scene data using procedural models that approximate realistic environments for robotic manipulation, and adapt it to task-specific goals. We do this by training a unified diffusion-based generative model that predicts which objects to place from a fixed asset library, along with their SE(3) poses. This model serves as a flexible scene prior that can be adapted using reinforcement learning-based post training, conditional generation, or inference-time search, steering generation toward downstream objectives even when they differ from the original data distribution. Our method enables goal-directed scene synthesis that respects physical feasibility and scales across scene types. We introduce a novel MCTS-based inference-time search strategy for diffusion models, enforce feasibility via projection and simulation, and release a dataset of over 44 million SE(3) scenes spanning five diverse environments.
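To make the phrase "predicts which objects to place from a fixed asset library, along with their SE(3) poses" concrete, the following is a minimal sketch of one way such an output could be decoded into a scene. It is an illustration under stated assumptions, not the authors' representation: the asset names, the fixed-slot layout, the extra "empty" class, and the quaternion pose encoding are all hypothetical choices that the abstract does not specify.

```python
import math
from dataclasses import dataclass

# Hypothetical asset library; the abstract does not list the real assets.
ASSET_LIBRARY = ["mug", "plate", "bowl"]
# Assumed extra class index meaning "this slot holds no object".
EMPTY = len(ASSET_LIBRARY)


@dataclass
class PlacedObject:
    asset: str
    translation: tuple  # (x, y, z) position
    quaternion: tuple   # unit quaternion (w, x, y, z) for the rotation


def decode_slot(slot):
    """Decode one raw slot [class scores..., x, y, z, qw, qx, qy, qz].

    Returns a PlacedObject, or None if the highest-scoring class is EMPTY.
    """
    n_cls = len(ASSET_LIBRARY) + 1
    scores, pose = slot[:n_cls], slot[n_cls:]
    cls = max(range(n_cls), key=lambda i: scores[i])
    if cls == EMPTY:
        return None
    translation = tuple(pose[:3])
    q = pose[3:7]
    norm = math.sqrt(sum(c * c for c in q))
    # Normalize so the quaternion represents a valid rotation;
    # fall back to the identity rotation if the raw vector is all zeros.
    if norm == 0.0:
        quaternion = (1.0, 0.0, 0.0, 0.0)
    else:
        quaternion = tuple(c / norm for c in q)
    return PlacedObject(ASSET_LIBRARY[cls], translation, quaternion)


def decode_scene(slots):
    """Keep only the non-empty slots, giving a variable-size object list."""
    return [obj for obj in map(decode_slot, slots) if obj is not None]


# Two slots: the first scores highest for "plate", the second for EMPTY.
raw = [
    [0.1, 2.0, 0.3, 0.0, 0.5, 0.2, 0.8, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
]
scene = decode_scene(raw)
```

In this sketch a scene is a fixed number of slots, and the "empty" class lets the same fixed-size vector describe scenes with different object counts. Whether the paper uses this layout is not stated in the abstract.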

Cite

Text

Pfaff et al. "Steerable Scene Generation with Post Training and Inference-Time Search." Proceedings of The 9th Conference on Robot Learning, 2025.

Markdown

[Pfaff et al. "Steerable Scene Generation with Post Training and Inference-Time Search." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/pfaff2025corl-steerable/)

BibTeX

@inproceedings{pfaff2025corl-steerable,
  title     = {{Steerable Scene Generation with Post Training and Inference-Time Search}},
  author    = {Pfaff, Nicholas Ezra and Dai, Hongkai and Zakharov, Sergey and Iwase, Shun and Tedrake, Russ},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  year      = {2025},
  pages     = {1690-1702},
  volume    = {305},
  url       = {https://mlanthology.org/corl/2025/pfaff2025corl-steerable/}
}