Noisy Derivative-Free Optimization with Value Suppression

Abstract

Derivative-free optimization has shown advantages in solving sophisticated problems such as policy search, when the environment is noise-free. Many real-world environments, however, are noisy, where solution evaluations are inaccurate due to the noise. Noisy evaluation can badly injure derivative-free optimization, as it may make a worse solution look better. Sampling is a straightforward way to reduce noise, but previous studies have shown that delaying the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work further delays the noise handling and proposes a simple noise handling mechanism, i.e., value suppression. With value suppression, nothing is done about the noise until the best-so-far solution has not been improved for a period of time; the value of the best-so-far solution is then suppressed and the optimization continues. On synthetic problems as well as reinforcement learning tasks, experiments verify that value suppression can be significantly more effective than the previous methods.
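The mechanism described above can be illustrated with a minimal sketch: a simple random local search on a hypothetical noisy objective, where the recorded value of the best-so-far solution is suppressed (re-evaluated) whenever no improvement has been seen for a while. This is only an illustration of the idea, not the paper's exact algorithm; the objective `noisy_sphere`, the `patience` window, and all parameter values are assumptions.

```python
import random

def noisy_sphere(x, noise=0.5):
    # Hypothetical noisy objective: sphere function plus uniform noise.
    return sum(v * v for v in x) + random.uniform(-noise, noise)

def value_suppressed_search(f, dim=5, iters=2000, patience=50, step=0.3, seed=0):
    """Random local search with value suppression (a sketch).

    When the best-so-far solution has not been improved for `patience`
    iterations, its recorded value is suppressed by re-evaluating it,
    so a single lucky noisy evaluation cannot lock in a poor solution.
    """
    rng = random.Random(seed)
    best_x = [rng.uniform(-1, 1) for _ in range(dim)]
    best_v = f(best_x)
    stall = 0
    for _ in range(iters):
        # Propose a Gaussian perturbation of the current best solution.
        cand = [v + rng.gauss(0, step) for v in best_x]
        cand_v = f(cand)
        if cand_v < best_v:
            best_x, best_v = cand, cand_v
            stall = 0
        else:
            stall += 1
        if stall >= patience:
            # Value suppression: discard the possibly over-optimistic
            # recorded value and replace it with a fresh evaluation.
            best_v = f(best_x)
            stall = 0
    return best_x, best_v
```

Without the suppression step, a best-so-far value that happened to draw a large negative noise term would be nearly impossible to beat, stalling the search; re-evaluating it restores a realistic comparison baseline.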

Cite

Text

Wang et al. "Noisy Derivative-Free Optimization with Value Suppression." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11534

Markdown

[Wang et al. "Noisy Derivative-Free Optimization with Value Suppression." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/wang2018aaai-noisy/) doi:10.1609/AAAI.V32I1.11534

BibTeX

@inproceedings{wang2018aaai-noisy,
  title     = {{Noisy Derivative-Free Optimization with Value Suppression}},
  author    = {Wang, Hong and Qian, Hong and Yu, Yang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {1447-1454},
  doi       = {10.1609/AAAI.V32I1.11534},
  url       = {https://mlanthology.org/aaai/2018/wang2018aaai-noisy/}
}