Coresets for Wasserstein Distributionally Robust Optimization Problems
Abstract
Wasserstein distributionally robust optimization (\textsf{WDRO}) is a popular model to enhance the robustness of machine learning with ambiguous data. However, the complexity of \textsf{WDRO} can be prohibitive in practice since solving its ``minimax'' formulation requires a great amount of computation. Recently, several fast \textsf{WDRO} training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, the research on designing efficient algorithms for general large-scale \textsf{WDRO}s is still quite limited, to the best of our knowledge. \textit{Coreset} is an important tool for compressing large dataset, and thus it has been widely applied to reduce the computational complexities for many optimization problems. In this paper, we introduce a unified framework to construct the $\epsilon$-coreset for the general \textsf{WDRO} problems. Though it is challenging to obtain a conventional coreset for \textsf{WDRO} due to the uncertainty issue of ambiguous data, we show that we can compute a ``dual coreset'' by using the strong duality property of \textsf{WDRO}. Also, the error introduced by the dual coreset can be theoretically guaranteed for the original \textsf{WDRO} objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of \textsf{WDRO}. Finally, we implement our coreset approach and illustrate its effectiveness for several \textsf{WDRO} problems in the experiments. See \href{https://arxiv.org/abs/2210.04260}arXiv:2210.04260 for the full version of this paper. The code is available at \url{https://github.com/h305142/WDRO_coreset}.
Cite
Text
Huang et al. "Coresets for Wasserstein Distributionally Robust Optimization Problems." Neural Information Processing Systems, 2022.Markdown
[Huang et al. "Coresets for Wasserstein Distributionally Robust Optimization Problems." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/huang2022neurips-coresets/)BibTeX
@inproceedings{huang2022neurips-coresets,
title = {{Coresets for Wasserstein Distributionally Robust Optimization Problems}},
author = {Huang, Ruomin and Huang, Jiawei and Liu, Wenjie and Ding, Hu},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/huang2022neurips-coresets/}
}