A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search
Abstract
Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, a voxel-based semantic map, and a semantic search policy that efficiently finds objects needing rearrangement. Our method was the winning submission to the AI2-THOR Rearrangement Challenge at the Embodied AI Workshop at CVPR 2022, and improves on current state-of-the-art end-to-end reinforcement learning-based methods that learn visual room rearrangement policies, raising correct rearrangement from 0.53% to 16.56% while using only 2.7% as many samples from the environment.
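The sketch below illustrates the kind of voxel-based semantic map the abstract describes; it is not the authors' code, and all names, parameters, and the disagreement heuristic are illustrative assumptions. Labeled 3D points (e.g., from depth back-projection plus an off-the-shelf segmentation model) are binned into voxels, and comparing the current and goal maps flags object categories whose occupancy differs and therefore may need rearrangement.

from collections import Counter, defaultdict

import numpy as np


class VoxelSemanticMap:
    """Accumulate per-voxel semantic class votes (illustrative sketch)."""

    def __init__(self, voxel_size=0.1):
        self.voxel_size = voxel_size        # voxel edge length in meters (assumed value)
        self.votes = defaultdict(Counter)   # voxel index -> counts of observed class ids

    def update(self, points_xyz, labels):
        """Add N labeled 3D points: points_xyz is (N, 3) world coordinates, labels is length-N class ids."""
        voxels = np.floor(points_xyz / self.voxel_size).astype(int)
        for voxel, label in zip(map(tuple, voxels), labels):
            self.votes[voxel][label] += 1

    def occupied(self, category):
        """Return the set of voxels whose majority class is `category`."""
        return {v for v, c in self.votes.items()
                if c.most_common(1)[0][0] == category}


def objects_to_rearrange(current_map, goal_map, categories, min_disagreement=5):
    """Flag categories whose voxel occupancy differs between the current and goal maps."""
    moved = []
    for cat in categories:
        # symmetric difference: voxels occupied in one map but not the other
        diff = current_map.occupied(cat) ^ goal_map.occupied(cat)
        if len(diff) >= min_disagreement:   # threshold is a hypothetical noise filter
            moved.append(cat)
    return moved

In this simplified view, the downstream rearrangement loop would visit each flagged category and move the object until the two maps agree; the paper's semantic search policy governs where the agent looks to build these maps efficiently.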
Cite
Text
Trabucco et al. "A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search." International Conference on Learning Representations, 2023.
Markdown
[Trabucco et al. "A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/trabucco2023iclr-simple/)
BibTeX
@inproceedings{trabucco2023iclr-simple,
title = {{A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search}},
author = {Trabucco, Brandon and Sigurdsson, Gunnar A. and Piramuthu, Robinson and Sukhatme, Gaurav S. and Salakhutdinov, Ruslan},
booktitle = {International Conference on Learning Representations},
year = {2023},
url = {https://mlanthology.org/iclr/2023/trabucco2023iclr-simple/}
}