T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems

Abstract

To address the interpretability challenge in machine learning (ML) systems, counterfactual explanations (CEs) have emerged as a promising solution. CEs are distinctive in that they give users actionable suggestions rather than explaining why a certain outcome was predicted. Applying CEs raises two main challenges: general user preferences and variable ML systems. On one hand, user preferences for specific values can vary depending on the task and scenario. On the other hand, the ML systems used for verification may change while the CEs are being carried out. Thus, user preferences tend to be general rather than specific, and CEs need to adapt to variable ML models and remain robust as these models change. To meet these challenges, we propose general user preferences based on insights from psychology and behavioral science, and include the challenge of non-static ML systems as one such preference. We introduce a novel method, Tree-based Conditions Optional Links (T-COL), for generating CEs adaptable to general user preferences. We also employ T-COL to enhance the robustness of CEs with specific conditions, so that CEs remain robust even when the ML models are replaced. To assess subjective preferences, we define LLM-based autonomous agents to simulate users and align them with real users. Experiments show that T-COL outperforms all baselines in adapting to general user preferences.

Cite

Text

Wang et al. "T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems." Journal of Artificial Intelligence Research, 2026. doi:10.1613/JAIR.1.18166

Markdown

[Wang et al. "T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems." Journal of Artificial Intelligence Research, 2026.](https://mlanthology.org/jair/2026/wang2026jair-tcol/) doi:10.1613/JAIR.1.18166

BibTeX

@article{wang2026jair-tcol,
  title     = {{T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems}},
  author    = {Wang, Ming and Wang, Daling and Wu, Wenfang and Feng, Shi and Zhang, Yifei},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2026},
  doi       = {10.1613/JAIR.1.18166},
  volume    = {85},
  url       = {https://mlanthology.org/jair/2026/wang2026jair-tcol/}
}