GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning
Abstract
Placing and orienting a camera to compose aesthetically meaningful shots of a scene is a key objective not only in real-world photography and cinematography but also in virtual content creation. The framing of a camera often contributes significantly to the storytelling in movies, games, and mixed reality applications. Generating single camera poses or even contiguous trajectories requires either a significant amount of manual labor or solving high-dimensional optimization problems, which can be computationally demanding and error-prone. In this paper, we introduce GAIT, a framework for training a Deep Reinforcement Learning (DRL) agent that learns to automatically control a camera to generate a sequence of aesthetically meaningful views for synthetic 3D indoor scenes. To generate sequences of frames with high aesthetic value, GAIT relies on a neural aesthetics estimator, which is trained on a crowd-sourced dataset. Additionally, we introduce regularization techniques for diversity and smoothness to generate visually interesting trajectories for a 3D environment, and we constrain agent acceleration in the reward function to generate a smooth sequence of camera frames. We validated our method by comparing it to baseline algorithms, based on a perceptual user study, and through ablation studies. Code and visual results are available on the project website: https://desaixie.github.io/gait-rl
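The abstract describes a reward that combines an aesthetic score with diversity and smoothness terms. As a rough illustration only, the sketch below shows one way such a shaped reward might be assembled. Everything in it is an assumption rather than the paper's formulation: the function name `shaped_reward`, the weights `w_smooth` and `w_diverse`, the cap `cap`, the finite-difference acceleration penalty, and the nearest-visited-position diversity bonus are all hypothetical, and the aesthetic score is passed in as a plain number instead of coming from a neural estimator.

```python
import numpy as np


def shaped_reward(aesthetic_score, trajectory, w_smooth=0.5, w_diverse=0.2, cap=1.0):
    """Hypothetical shaped reward: aesthetics minus acceleration plus diversity.

    aesthetic_score: scalar score for the current view (assumed to be given).
    trajectory: sequence of camera positions so far, newest last (at least 3).
    """
    traj = np.asarray(trajectory, dtype=float)
    if len(traj) < 3:
        raise ValueError("need at least three camera positions")

    # Smoothness (assumed form): second finite difference of the last three
    # positions approximates the camera acceleration; penalize its magnitude.
    accel = traj[-1] - 2.0 * traj[-2] + traj[-3]
    smooth_penalty = float(np.linalg.norm(accel))

    # Diversity (assumed form): distance from the current position to the
    # nearest earlier position, capped so the bonus stays bounded.
    dists = np.linalg.norm(traj[:-1] - traj[-1], axis=1)
    diversity_bonus = float(min(dists.min(), cap))

    return aesthetic_score - w_smooth * smooth_penalty + w_diverse * diversity_bonus


# A constant-velocity path has zero acceleration and keeps moving to new places.
smooth_path = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]
# A path that stops abruptly has non-zero acceleration and revisits a position.
jerky_path = [[0, 0, 0], [1, 0, 0], [1, 0, 0]]

r_smooth = shaped_reward(0.8, smooth_path)
r_jerky = shaped_reward(0.8, jerky_path)
```

With the same aesthetic score, the constant-velocity path receives the higher reward under these assumed terms, which is the qualitative behavior the abstract attributes to its regularizers.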
Cite
Text
Xie et al. "GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00681
Markdown
[Xie et al. "GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/xie2023iccv-gait/) doi:10.1109/ICCV51070.2023.00681
BibTeX
@inproceedings{xie2023iccv-gait,
title = {{GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning}},
author = {Xie, Desai and Hu, Ping and Sun, Xin and Pirk, Soren and Zhang, Jianming and Mech, Radomir and Kaufman, Arie E.},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {7409-7419},
doi = {10.1109/ICCV51070.2023.00681},
url = {https://mlanthology.org/iccv/2023/xie2023iccv-gait/}
}