Online Learning of Quantum States

Abstract

Suppose we have many copies of an unknown $n$-qubit state $\rho$. We measure some copies of $\rho$ using a known two-outcome measurement $E_1$, then other copies using a measurement $E_2$, and so on. At each stage $t$, we generate a current hypothesis $\omega_t$ about the state $\rho$, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that $|\mathrm{Tr}(E_i \omega_t) - \mathrm{Tr}(E_i \rho)|$, the error in our prediction for the next measurement, is at least $\varepsilon$ at most $O(n/\varepsilon^2)$ times. Even in the non-realizable setting---where there could be arbitrary noise in the measurement outcomes---we show how to output hypothesis states that incur at most $O(\sqrt{Tn})$ excess loss over the best possible state on the first $T$ measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states, to the online and regret-minimization settings. We give three different ways to prove our results---using convex optimization, quantum postselection, and sequential fat-shattering dimension---which have different advantages in terms of parameters and portability.
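The online setting described above can be illustrated with a matrix multiplicative weights update, one of the standard convex-optimization tools in this area. The sketch below is an illustrative toy, not the paper's exact algorithm: the squared loss, the learning rate `eta`, and the use of exact trace values in place of noisy measurement outcomes are all simplifying assumptions for demonstration.

```python
import numpy as np

def mmw_online_learning(true_rho, measurements, eta=0.1):
    """Toy matrix-multiplicative-weights sketch of online learning of a
    quantum state (illustrative assumptions; not the paper's exact method).

    true_rho     : d x d density matrix (used here only to generate targets)
    measurements : sequence of d x d Hermitian two-outcome measurement operators
    Returns the per-step prediction errors |Tr(E omega_t) - Tr(E rho)|.
    """
    d = true_rho.shape[0]
    S = np.zeros((d, d), dtype=complex)  # accumulated loss gradients
    errors = []
    for E in measurements:
        # Current hypothesis: omega_t proportional to exp(-eta * S),
        # normalized to trace 1 (computed via eigendecomposition of a
        # Hermitian matrix for numerical stability).
        w, V = np.linalg.eigh(-eta * S)
        exp_w = np.exp(w - w.max())          # shift to avoid overflow
        omega = (V * exp_w) @ V.conj().T
        omega /= np.trace(omega).real
        pred = np.trace(E @ omega).real
        target = np.trace(E @ true_rho).real  # stand-in for an observed outcome
        errors.append(abs(pred - target))
        # Gradient of the squared loss (pred - target)^2 with respect to omega
        # is 2 * (pred - target) * E; accumulate it for the next update.
        S = S + 2.0 * (pred - target) * E
    return errors
```

For example, repeatedly presenting the measurement $E = |0\rangle\langle 0|$ against the state $\rho = |0\rangle\langle 0|$ drives the prediction from the maximally mixed starting point ($\mathrm{Tr}(E\,\omega_1) = 1/2$) toward the true value $1$, so the per-step error shrinks over time.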

Cite

Text

Aaronson et al. "Online Learning of Quantum States." Neural Information Processing Systems, 2018.

Markdown

[Aaronson et al. "Online Learning of Quantum States." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/aaronson2018neurips-online/)

BibTeX

@inproceedings{aaronson2018neurips-online,
  title     = {{Online Learning of Quantum States}},
  author    = {Aaronson, Scott and Chen, Xinyi and Hazan, Elad and Kale, Satyen and Nayak, Ashwin},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {8962--8972},
  url       = {https://mlanthology.org/neurips/2018/aaronson2018neurips-online/}
}