Detecting Critical Treatment Effect Bias in Small Subgroups

Abstract

Randomized trials are considered the gold standard for making informed decisions in medicine. However, they are often not representative of the patient population seen in clinical practice. Observational studies, on the other hand, cover a broader patient population but are prone to various biases. Thus, before using observational data for any downstream task, it is crucial to benchmark its treatment effect estimates against those of a randomized trial. We propose a novel strategy to benchmark observational studies at the subgroup level. First, we design a statistical test for the null hypothesis that the treatment effects, conditioned on a subset of relevant features, differ by no more than some tolerance value. Our test allows us to estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup. We validate our lower bound in a real-world setting and show that it leads to conclusions that align with established medical knowledge.
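
One way to read the null hypothesis described in the abstract, as a rough sketch (the notation below is illustrative and not necessarily the paper's): write $\tau_{\mathrm{rct}}(x)$ and $\tau_{\mathrm{obs}}(x)$ for the treatment effects identified from the trial and from the observational data, conditioned on features $x$, let $\mathcal{S}$ be a subgroup of interest, and let $\delta \ge 0$ be the tolerance. The tested hypothesis can then be understood as

$$
H_0(\delta):\quad \sup_{x \in \mathcal{S}} \,\bigl|\tau_{\mathrm{obs}}(x) - \tau_{\mathrm{rct}}(x)\bigr| \le \delta .
$$

Under this reading, rejecting $H_0(\delta)$ at a given tolerance certifies that the observational and trial effects disagree by more than $\delta$ somewhere in the subgroup; searching over $\delta$ is one route to the kind of lower bound on the bias that the abstract describes.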

Cite

Text

De Bartolomeis et al. "Detecting Critical Treatment Effect Bias in Small Subgroups." Uncertainty in Artificial Intelligence, 2024.

Markdown

[De Bartolomeis et al. "Detecting Critical Treatment Effect Bias in Small Subgroups." Uncertainty in Artificial Intelligence, 2024.](https://mlanthology.org/uai/2024/debartolomeis2024uai-detecting/)

BibTeX

@inproceedings{debartolomeis2024uai-detecting,
  title     = {{Detecting Critical Treatment Effect Bias in Small Subgroups}},
  author    = {De Bartolomeis, Piersilvio and Abad, Javier and Donhauser, Konstantin and Yang, Fanny},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2024},
  pages     = {943--965},
  volume    = {244},
  url       = {https://mlanthology.org/uai/2024/debartolomeis2024uai-detecting/}
}