Error Profiling of Machine Learning Models: An Exploratory Visualization

Abstract

While data-driven predictive models are increasingly used in healthcare, their clinical translation remains limited—partly due to challenges in evaluating model performance across design choices. Existing explainability methods often focus on intra-model interpretability but fall short in supporting inter-model comparisons. We present a visualization-based error profiling method that facilitates comparative evaluation by highlighting overlaps and differences in model predictions. Our matrix-based visualization maps which models incorrectly classify which patient subgroups, with color intensity indicating the number of misclassified patients. This approach enables deeper insight into which (sub)populations are consistently (in)correctly classified across models, helping uncover patterns of model (dis)agreement and assess the impact of modeling decisions. We demonstrate our visualization method in four healthcare use cases: 1) missing data imputation in a longitudinal nutritional dataset; 2) feature set analysis using randomized controlled trial data; 3) end-model technical performance in cardiac morbidity prediction; and 4) data modality comparison using a dual-source lung cancer dataset with longitudinal and radiomic features. To evaluate the visualization, we obtained expert feedback and qualitative assessments of decision-making insights. Survey results—across clinicians, computer scientists, and medical informaticians—indicated that our method provides an interpretable and intuitive way to compare model error distributions by highlighting patterns within correctly and incorrectly classified subpopulations across different models. Our comprehensible error profiling approach represents an initial step toward a systematic framework for improving model assessment in clinical tasks. Through this framework, both model developers and end users can better understand when and where a given model is appropriate for real-world clinical deployment.
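To make the matrix idea concrete, the following is a minimal sketch of how an error profile could be tabulated. It is not the authors' implementation. It assumes that a "subgroup" means the set of patients who share the same pattern of model errors, and that each cell holds the subgroup size where the model errs and 0 otherwise. The function names `error_profile` and `to_matrix`, the model names, and the toy labels are all hypothetical.

```python
from collections import Counter


def error_profile(y_true, predictions):
    """Count patients per error pattern across models.

    y_true: list of true labels.
    predictions: dict mapping a model name to its list of predicted labels.
    Returns (model_names, rows). Each row is (pattern, count), where pattern
    is a tuple of booleans, True where that model misclassified the patient.
    """
    model_names = sorted(predictions)
    for name in model_names:
        if len(predictions[name]) != len(y_true):
            raise ValueError(f"length mismatch for model {name!r}")
    patterns = Counter(
        tuple(predictions[name][i] != label for name in model_names)
        for i, label in enumerate(y_true)
    )
    # Largest subgroups first; ties broken by pattern so the order is stable.
    rows = sorted(patterns.items(), key=lambda item: (-item[1], item[0]))
    return model_names, rows


def to_matrix(rows):
    """Assumed cell rule: subgroup size where the model errs, else 0."""
    return [[count if wrong else 0 for wrong in pattern] for pattern, count in rows]


# Toy data, invented purely for illustration.
y_true = [1, 0, 1, 1, 0, 0]
predictions = {
    "model_a": [1, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 0, 1, 0, 0],
}
models, rows = error_profile(y_true, predictions)
matrix = to_matrix(rows)
```

Each row of `matrix` corresponds to one error pattern and each column to one model, so the values could be passed to a heatmap routine to drive color intensity. The all-zero row is the subgroup that every model classifies correctly, and a row with every cell nonzero is a subgroup that every model misclassifies.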

Cite

Text

Feng et al. "Error Profiling of Machine Learning Models: An Exploratory Visualization." Proceedings of the 10th Machine Learning for Healthcare Conference, 2025.

Markdown

[Feng et al. "Error Profiling of Machine Learning Models: An Exploratory Visualization." Proceedings of the 10th Machine Learning for Healthcare Conference, 2025.](https://mlanthology.org/mlhc/2025/feng2025mlhc-error/)

BibTeX

@inproceedings{feng2025mlhc-error,
  title     = {{Error Profiling of Machine Learning Models: An Exploratory Visualization}},
  author    = {Feng, Jeffrey and Rahrooh, Al and Bui, Alex},
  booktitle = {Proceedings of the 10th Machine Learning for Healthcare Conference},
  year      = {2025},
  volume    = {298},
  url       = {https://mlanthology.org/mlhc/2025/feng2025mlhc-error/}
}