Online Learning in Dynamically Changing Environments
Abstract
We study the problem of online learning and online regret minimization when samples are drawn from a general unknown \emph{non-stationary} process. We introduce the concept of a \emph{dynamically changing process} with cost $K$, where the \emph{conditional} marginals of the process can vary arbitrarily, but the number of different conditional marginals is bounded by $K$ over $T$ rounds. For such processes we prove a tight (up to a $\sqrt{\log T}$ factor) bound of $O(\sqrt{KT\cdot\mathrm{VC}(\mathcal{H})\log T})$ on the \emph{expected worst-case} regret of any class $\mathcal{H}$ of finite VC dimension under absolute loss (i.e., the expected misclassification loss). We then improve this bound for general mixable losses by establishing a tight (up to a $\log^3 T$ factor) regret bound of $O(K\cdot\mathrm{VC}(\mathcal{H})\log^3 T)$. We extend these results to general \emph{smooth adversary} processes with \emph{unknown} reference measure by showing a sub-linear regret bound for $1$-dimensional threshold functions under a general bounded convex loss. Our results can be viewed as a first step towards regret analysis with non-stationary samples in the \emph{distribution-blind} (universal) regime. This also brings a new viewpoint that shifts the study of the complexity of hypothesis classes to the study of the complexity of the processes generating data.
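For concreteness, one way to write down the regret notion at issue is sketched below (an illustrative formalization, not taken verbatim from the paper; the predictor $\hat{y}_t$, labels $y_t$, covariates $x_t$, and loss $\ell$ are assumed notation):

% Illustrative sketch: cumulative regret of an online learner against a
% comparator class $\mathcal{H}$; notation is assumed, not from the abstract.
\[
  \mathrm{Reg}_T(\mathcal{H}) \;=\; \sum_{t=1}^{T} \ell(\hat{y}_t, y_t)
  \;-\; \inf_{h \in \mathcal{H}} \sum_{t=1}^{T} \ell(h(x_t), y_t),
\]
so that, for a dynamically changing process with cost $K$, the two bounds above read
\[
  \mathbb{E}\big[\mathrm{Reg}_T(\mathcal{H})\big]
  = O\!\big(\sqrt{KT \cdot \mathrm{VC}(\mathcal{H}) \log T}\big)
  \ \text{(absolute loss)}, \qquad
  O\!\big(K \cdot \mathrm{VC}(\mathcal{H}) \log^3 T\big)
  \ \text{(mixable losses)}.
\]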
Cite
Text
Wu et al. "Online Learning in Dynamically Changing Environments." Conference on Learning Theory, 2023.
Markdown
[Wu et al. "Online Learning in Dynamically Changing Environments." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/wu2023colt-online/)
BibTeX
@inproceedings{wu2023colt-online,
  title     = {{Online Learning in Dynamically Changing Environments}},
  author    = {Wu, Changlong and Grama, Ananth and Szpankowski, Wojciech},
  booktitle = {Conference on Learning Theory},
  year      = {2023},
  pages     = {325--358},
  volume    = {195},
  url       = {https://mlanthology.org/colt/2023/wu2023colt-online/}
}