Interpretation Attacks and Defenses on Predictive Models Using Electronic Health Records

Razmi, Fereshteh; Lou, Jian; Hong, Yuan; Xiong, Li

doi:10.1007/978-3-031-43418-1_27

ECML-PKDD 2023 pp. 446-461

Abstract

The emergence of complex deep neural networks has made it crucial to employ interpretation methods for gaining insight into the rationale behind model predictions. However, recent studies have revealed attacks on these interpretations, which aim to deceive users and subvert the trustworthiness of the models. Such attacks are especially critical in medical systems, where interpretations are essential for explaining outcomes. This paper presents the first interpretation attack on predictive models using sequential electronic health records (EHRs). Prior attempts in image interpretation mainly utilized gradient-based methods, yet our research shows that our attack can attain significant success on EHR interpretations that do not rely on model gradients. We introduce metrics compatible with EHR data to evaluate the attack’s success. Moreover, our findings demonstrate that detection methods that have successfully identified conventional adversarial examples are ineffective against our attack. We then propose a defense method that uses auto-encoders to de-noise the data and improve the interpretations’ robustness. Our results indicate that this de-noising method outperforms the widely used defense method, SmoothGrad, which is based on adding noise to the data.

Cite

Text

Razmi et al. "Interpretation Attacks and Defenses on Predictive Models Using Electronic Health Records." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43418-1_27

Markdown

[Razmi et al. "Interpretation Attacks and Defenses on Predictive Models Using Electronic Health Records." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/razmi2023ecmlpkdd-interpretation/) doi:10.1007/978-3-031-43418-1_27

BibTeX

@inproceedings{razmi2023ecmlpkdd-interpretation,
  title     = {{Interpretation Attacks and Defenses on Predictive Models Using Electronic Health Records}},
  author    = {Razmi, Fereshteh and Lou, Jian and Hong, Yuan and Xiong, Li},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {446-461},
  doi       = {10.1007/978-3-031-43418-1_27},
  url       = {https://mlanthology.org/ecmlpkdd/2023/razmi2023ecmlpkdd-interpretation/}
}