Online Learning of Quantum States
Abstract
Suppose we have many copies of an unknown $n$-qubit state $\rho$. We measure some copies of $\rho$ using a known two-outcome measurement $E_1$, then other copies using a measurement $E_2$, and so on. At each stage $t$, we generate a current hypothesis $\omega_t$ about the state $\rho$, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that $|\operatorname{Tr}(E_i \omega_t) - \operatorname{Tr}(E_i \rho)|$, the error in our prediction for the next measurement, is at least $\varepsilon$ at most $O(n/\varepsilon^2)$ times. Even in the non-realizable setting, where there could be arbitrary noise in the measurement outcomes, we show how to output hypothesis states that incur at most $O(\sqrt{Tn})$ excess loss over the best possible state on the first $T$ measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states to the online and regret-minimization settings. We give three different ways to prove our results (using convex optimization, quantum postselection, and sequential fat-shattering dimension), which have different advantages in terms of parameters and portability.
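To make the online protocol concrete, the following is a minimal sketch of one way the hypothesis $\omega_t$ could be updated, assuming a matrix exponentiated-gradient (matrix multiplicative weights) rule with absolute-error loss. The abstract does not specify the algorithm, so this is an illustration of the convex-optimization flavor of approach rather than the authors' method. The function names, the step size `eta`, the random test measurements, and the noiseless feedback are all assumptions made for the example.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """Sample a random mixed state (hypothetical helper for the demo)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T  # positive semidefinite by construction
    return rho / np.trace(rho).real


def random_two_outcome_measurement(dim, rng):
    """Sample a Hermitian E with eigenvalues in [0, 1] (hypothetical helper)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    _, evecs = np.linalg.eigh((a + a.conj().T) / 2)  # random eigenbasis
    evals = rng.uniform(0.0, 1.0, size=dim)
    return (evecs * evals) @ evecs.conj().T


def state_from_gradient_sum(grad_sum, eta):
    """Return exp(-eta * grad_sum) normalized to unit trace."""
    evals, evecs = np.linalg.eigh(grad_sum)
    weights = np.exp(-eta * (evals - evals.min()))  # shift for stability
    weights /= weights.sum()
    return (evecs * weights) @ evecs.conj().T


def online_learn(rho, measurements, eta):
    """Predict Tr(E_t rho) for each E_t in turn, updating after each round."""
    dim = rho.shape[0]
    grad_sum = np.zeros((dim, dim), dtype=complex)
    errors = []
    for E in measurements:
        omega = state_from_gradient_sum(grad_sum, eta)  # current hypothesis
        prediction = np.trace(E @ omega).real
        truth = np.trace(E @ rho).real  # assumption: exact, noiseless feedback
        errors.append(abs(prediction - truth))
        # A subgradient of |Tr(E omega) - truth| in omega is sign(...) * E.
        grad_sum += np.sign(prediction - truth) * E
    return np.array(errors), state_from_gradient_sum(grad_sum, eta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 2 ** 2  # n = 2 qubits
    rho = random_density_matrix(dim, rng)
    measurements = [random_two_outcome_measurement(dim, rng) for _ in range(200)]
    errors, omega = online_learn(rho, measurements, eta=0.2)
    print("mean error, first 20 rounds:", errors[:20].mean())
    print("mean error, last 20 rounds: ", errors[-20:].mean())
```

The first hypothesis is the maximally mixed state, since the accumulated gradient starts at zero. The value of `eta` here is hand-picked for the demo and is not taken from the paper; in the non-realizable setting the feedback `truth` would be replaced by a noisy outcome.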
Cite
Text
Aaronson et al. "Online Learning of Quantum States." Neural Information Processing Systems, 2018.
Markdown
[Aaronson et al. "Online Learning of Quantum States." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/aaronson2018neurips-online/)
BibTeX
@inproceedings{aaronson2018neurips-online,
title = {{Online Learning of Quantum States}},
author = {Aaronson, Scott and Chen, Xinyi and Hazan, Elad and Kale, Satyen and Nayak, Ashwin},
booktitle = {Neural Information Processing Systems},
year = {2018},
pages = {8962--8972},
url = {https://mlanthology.org/neurips/2018/aaronson2018neurips-online/}
}