Trustworthy Policy Learning Under the Counterfactual No-Harm Criterion

Abstract

Trustworthy policy learning is of great importance for making reliable and harmless treatment decisions for individuals. Previous policy learning approaches aim at the well-being of subgroups by maximizing a utility function (e.g., conditional average causal effects, or the post-view click-through & conversion rate in recommendation); however, the individual-level counterfactual no-harm criterion has rarely been discussed. In this paper, we first formalize the counterfactual no-harm criterion for policy learning from a principal stratification perspective. Next, we propose a novel upper bound for the fraction of individuals negatively affected by the policy, and show the consistency and asymptotic normality of its estimator. Based on the estimators for the policy utility and the harm upper bound, we further propose a policy learning approach that satisfies the counterfactual no-harm criterion, and prove that its policy reward converges to the optimal policy reward for parametric and non-parametric policy classes, respectively. Extensive experiments demonstrate the effectiveness of the proposed policy learning approach in satisfying the counterfactual no-harm criterion.
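
To make the abstract's constrained-policy idea concrete, below is a minimal, hedged sketch (not the paper's estimator or learning algorithm). For binary outcomes, the harm fraction P(Y(0)=1, Y(1)=0) is not point-identified, so the sketch uses the classical Frechet-Hoeffding upper bound min(P(Y(0)=1), P(Y(1)=0)), estimated from randomized data, and treats a subgroup only when the estimated utility is positive and the harm bound falls below a tolerance eps. The paper's novel bound is tighter than this; all function and variable names here are hypothetical illustrations.

import numpy as np

def estimate_utility_and_harm_bound(y, t):
    """Estimate treatment utility and a Frechet upper bound on harm
    from randomized binary-outcome data (hypothetical helper, not the
    paper's proposed estimator).

    Utility: P(Y=1 | T=1) - P(Y=1 | T=0), the difference-in-means effect.
    Harm fraction P(Y(0)=1, Y(1)=0) is not point-identified; the
    Frechet-Hoeffding inequality bounds it above by
        min( P(Y(0)=1), P(Y(1)=0) ).
    """
    p1 = y[t == 1].mean()            # estimate of P(Y(1) = 1)
    p0 = y[t == 0].mean()            # estimate of P(Y(0) = 1)
    utility = p1 - p0
    harm_upper = min(p0, 1.0 - p1)   # Frechet upper bound on harm
    return utility, harm_upper

def no_harm_policy(groups, y, t, eps=0.05):
    """Assign treatment to a subgroup only if its estimated utility is
    positive AND the harm upper bound is at most eps (a sketch of a
    no-harm-constrained policy rule, not the paper's method)."""
    policy = {}
    for g in np.unique(groups):
        m = groups == g
        u, h = estimate_utility_and_harm_bound(y[m], t[m])
        policy[g] = bool(u > 0 and h <= eps)
    return policy

# Toy usage on synthetic randomized data (three covariate strata).
rng = np.random.default_rng(0)
n = 10_000
groups = rng.integers(0, 3, n)              # stratum labels
t = rng.integers(0, 2, n)                   # randomized treatment
base = np.array([0.2, 0.5, 0.8])[groups]    # stratum-level P(Y(0)=1)
lift = np.array([0.3, 0.0, -0.2])[groups]   # stratum-level effect
y = (rng.random(n) < base + t * lift).astype(int)
print(no_harm_policy(groups, y, t, eps=0.3))

In this toy run, only the first stratum is treated: it has positive estimated utility and a harm bound below the tolerance, while the other strata fail one of the two conditions, mirroring the constrained selection that the no-harm criterion imposes on top of plain utility maximization.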

Cite

Text

Li et al. "Trustworthy Policy Learning Under the Counterfactual No-Harm Criterion." International Conference on Machine Learning, 2023.

Markdown

[Li et al. "Trustworthy Policy Learning Under the Counterfactual No-Harm Criterion." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/li2023icml-trustworthy/)

BibTeX

@inproceedings{li2023icml-trustworthy,
  title     = {{Trustworthy Policy Learning Under the Counterfactual No-Harm Criterion}},
  author    = {Li, Haoxuan and Zheng, Chunyuan and Cao, Yixiao and Geng, Zhi and Liu, Yue and Wu, Peng},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {20575--20598},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/li2023icml-trustworthy/}
}