Data Poisoning Attacks on Off-Policy Policy Evaluation Algorithms
Abstract
Off-policy evaluation (OPE) methods are crucial for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible or expensive. However, the extent to which such methods can be trusted under adversarial threats to data quality is largely unexplored. In this work, we make the first attempt at investigating the sensitivity of OPE methods to adversarial perturbations of the data. We design a data poisoning attack framework that leverages influence functions to construct perturbations that maximize error in the policy value estimates. Our experimental results show that many OPE methods are highly vulnerable to data poisoning, even under small adversarial perturbations.
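To give a concrete flavor of the attack idea, the sketch below poisons rewards against an ordinary importance-sampling OPE estimator in a one-step (bandit) setting. This is an illustration only, not the paper's framework: the synthetic data, the budget parameters eps and k, and the estimator choice are all assumptions. For this linear estimator the influence of each logged reward on the estimate has the closed form rho_i / n, so an attacker can rank transitions by influence and concentrate its perturbation budget on the most influential ones.

import numpy as np

# Synthetic bandit-style logged data (illustrative, not from the paper).
rng = np.random.default_rng(0)
n = 1000
pi_b = rng.uniform(0.2, 0.8, size=n)    # behavior policy prob. of each logged action
pi_e = rng.uniform(0.2, 0.8, size=n)    # evaluation policy prob. of the same action
rewards = rng.normal(1.0, 0.5, size=n)  # logged rewards

def is_estimate(r):
    # Ordinary importance-sampling estimate of the evaluation policy's value.
    rho = pi_e / pi_b
    return float(np.mean(rho * r))

# For this linear estimator, the influence of reward r_i on the estimate
# is exactly d V_hat / d r_i = rho_i / n.
influence = (pi_e / pi_b) / n

# Attack 1: perturb every reward within an L-infinity budget eps,
# signed to push the estimate upward.
eps = 0.1
poisoned = rewards + eps * np.sign(influence)

# Attack 2: corrupt only the k most influential transitions by a fixed amount.
k, delta = 50, 1.0
idx = np.argsort(-np.abs(influence))[:k]
targeted = rewards.copy()
targeted[idx] += delta * np.sign(influence[idx])

print(f"clean estimate:     {is_estimate(rewards):.4f}")
print(f"poisoned (eps):     {is_estimate(poisoned):.4f}")
print(f"poisoned (top-{k}): {is_estimate(targeted):.4f}")

Both attacks shift the estimate upward, since every perturbation is signed to agree with its (positive) importance weight. The paper's framework applies influence functions to carry out this kind of error-maximizing perturbation against general OPE estimators.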
Cite
Text
Lobo et al. "Data Poisoning Attacks on Off-Policy Policy Evaluation Algorithms." ICLR 2022 Workshops: PAIR2Struct, 2022.
Markdown
[Lobo et al. "Data Poisoning Attacks on Off-Policy Policy Evaluation Algorithms." ICLR 2022 Workshops: PAIR2Struct, 2022.](https://mlanthology.org/iclrw/2022/lobo2022iclrw-data/)
BibTeX
@inproceedings{lobo2022iclrw-data,
title = {{Data Poisoning Attacks on Off-Policy Policy Evaluation Algorithms}},
author = {Lobo, Elita and Singh, Harvineet and Petrik, Marek and Rudin, Cynthia and Lakkaraju, Himabindu},
booktitle = {ICLR 2022 Workshops: PAIR2Struct},
year = {2022},
url = {https://mlanthology.org/iclrw/2022/lobo2022iclrw-data/}
}