Learning from Uncertain Concepts via Test Time Interventions

Abstract

As neural networks are increasingly deployed in safety-critical applications, it has become important to understand the features that drive their decisions. There is therefore a clear need to open these black boxes and expose an interpretable representational space. Concept bottleneck models (CBMs) encourage interpretability by predicting human-understandable concepts: they first predict concepts from input images and then predict labels from those concepts. Test-time intervention, a salient feature of CBMs, allows for human-model interaction. However, these interactions are prone to information leakage and can often be ineffective due to inappropriate communication with humans. We propose a novel uncertainty-based strategy, SIUL (Single Interventional Uncertainty Learning), to select the interventions. Additionally, we empirically test the robustness of CBMs and the effect of SIUL interventions under adversarial attacks and distributional shift. Using SIUL, we observe that the suggested interventions lead to meaningful corrections while mitigating concept leakage. Extensive experiments on three vision datasets along with a histopathology dataset validate the effectiveness of our interventional learning.
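As a rough illustration of the mechanism the abstract describes, the sketch below shows a concept bottleneck model whose label prediction can be re-run at test time after replacing the most uncertain predicted concepts with human-provided ground truth. This is a minimal PyTorch sketch, not the authors' code: the names (ConceptBottleneck, intervene_top_k) and the margin-based uncertainty heuristic are assumptions for this example; the paper's SIUL criterion is learned rather than hand-crafted.

```python
# Illustrative sketch only (not the paper's implementation): a concept
# bottleneck model with uncertainty-ranked test-time interventions.
import torch
import torch.nn as nn


class ConceptBottleneck(nn.Module):
    """x -> concepts -> label, the two-stage structure of a CBM."""

    def __init__(self, backbone: nn.Module, n_concepts: int, n_classes: int):
        super().__init__()
        self.backbone = backbone                          # image -> features
        self.concept_head = nn.LazyLinear(n_concepts)     # features -> concept logits
        self.label_head = nn.Linear(n_concepts, n_classes)  # concepts -> label logits

    def forward(self, x):
        c_probs = torch.sigmoid(self.concept_head(self.backbone(x)))
        return c_probs, self.label_head(c_probs)


def intervene_top_k(model, c_probs, true_concepts, k):
    """Replace the k most uncertain predicted concepts with ground truth,
    then re-predict the label from the corrected concept vector.

    Uncertainty here is distance from a confident 0/1 prediction; this
    heuristic is a stand-in for the paper's learned SIUL criterion.
    """
    uncertainty = -(c_probs - 0.5).abs()          # largest near p = 0.5
    idx = uncertainty.topk(k, dim=1).indices      # k most uncertain per sample
    corrected = c_probs.clone()
    corrected.scatter_(1, idx, true_concepts.gather(1, idx))
    return model.label_head(corrected)
```

In use, one forward pass yields c_probs and an initial label prediction; a human then supplies the true values only for the selected concepts, and intervene_top_k re-predicts the label from the corrected bottleneck.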

Cite

Text

Sheth et al. "Learning from Uncertain Concepts via Test Time Interventions." NeurIPS 2022 Workshops: TSRML, 2022.

Markdown

[Sheth et al. "Learning from Uncertain Concepts via Test Time Interventions." NeurIPS 2022 Workshops: TSRML, 2022.](https://mlanthology.org/neuripsw/2022/sheth2022neuripsw-learning/)

BibTeX

@inproceedings{sheth2022neuripsw-learning,
  title     = {{Learning from Uncertain Concepts via Test Time Interventions}},
  author    = {Sheth, Ivaxi and Rahman, Aamer Abdul and Sevyeri, Laya Rafiee and Havaei, Mohammad and Kahou, Samira Ebrahimi},
  booktitle = {NeurIPS 2022 Workshops: TSRML},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/sheth2022neuripsw-learning/}
}