AutoSimulate: (Quickly) Learning Synthetic Data Generation

Abstract

Simulation is increasingly being used for generating large labelled datasets in many machine learning problems. Recent methods have focused on adjusting simulator parameters with the goal of maximising accuracy on a validation task, usually relying on REINFORCE-like gradient estimators. However, these approaches are very expensive, as they treat the entire data generation, model training, and validation pipeline as a black box and require multiple costly objective evaluations at each iteration. We propose an efficient alternative for optimal synthetic data generation, based on a novel differentiable approximation of the objective. This allows us to optimise the simulator, which may be non-differentiable, using only one objective evaluation at each iteration with little overhead. We demonstrate on a state-of-the-art photorealistic renderer that the proposed method finds the optimal data distribution up to 50× faster, with significantly reduced training data generation and better accuracy than previous methods.
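
To make the cost contrast in the abstract concrete, the toy Python sketch below illustrates the REINFORCE-like black-box baseline that the paper improves upon. It is not the authors' code: the Gaussian search distribution over a single scene parameter, the quadratic stand-in for the "generate data, train model, validate" pipeline, and all names are illustrative assumptions. The point it demonstrates is that each score-function gradient step needs several full evaluations of the expensive objective, whereas AutoSimulate's differentiable approximation requires only one per iteration.

# Hypothetical sketch of the REINFORCE-like baseline described in the abstract.
# The real objective is a full data-generation + training + validation pipeline;
# here it is replaced by a cheap noisy quadratic so the script runs instantly.
import numpy as np

rng = np.random.default_rng(0)

def pipeline_objective(scene_param: float) -> float:
    """Stand-in for the expensive black box: render data with the sampled
    scene parameter, train a model on it, and return the validation loss.
    A toy quadratic with noise is used here purely for illustration."""
    return (scene_param - 2.0) ** 2 + 0.1 * rng.standard_normal()

def reinforce_step(mu: float, sigma: float, lr: float = 0.05, n_samples: int = 8) -> float:
    """One REINFORCE-like update of the simulator parameter mu.
    Note that it needs n_samples objective evaluations per iteration,
    which is what makes this approach costly with a real renderer."""
    samples = mu + sigma * rng.standard_normal(n_samples)
    losses = np.array([pipeline_objective(s) for s in samples])
    baseline = losses.mean()  # simple variance-reduction baseline
    # Score function of N(mu, sigma^2) with respect to mu: (s - mu) / sigma^2
    grad_mu = np.mean((losses - baseline) * (samples - mu) / sigma**2)
    return mu - lr * grad_mu  # gradient descent on the validation loss

mu = 0.0
for _ in range(200):
    mu = reinforce_step(mu, sigma=0.5)
print(f"estimated optimal scene parameter: {mu:.2f}")  # should approach 2.0

Even in this toy setting, 200 iterations with n_samples = 8 amount to 1,600 objective evaluations; with a photorealistic renderer each evaluation means rendering a dataset and training a model, which is exactly the expense the paper's single-evaluation, differentiable approximation is designed to avoid.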

Cite

Text

Behl et al. "AutoSimulate: (Quickly) Learning Synthetic Data Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58542-6_16

Markdown

[Behl et al. "AutoSimulate: (Quickly) Learning Synthetic Data Generation." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/behl2020eccv-autosimulate/) doi:10.1007/978-3-030-58542-6_16

BibTeX

@inproceedings{behl2020eccv-autosimulate,
  title     = {{AutoSimulate: (Quickly) Learning Synthetic Data Generation}},
  author    = {Behl, Harkirat Singh and Baydin, Atılım Güneş and Gal, Ran and Torr, Philip H.S. and Vineet, Vibhav},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58542-6_16},
  url       = {https://mlanthology.org/eccv/2020/behl2020eccv-autosimulate/}
}