Filtering Instances and Rejecting Predictions to Obtain Reliable Models in Healthcare

Maria Gabriela Valeriano, David Kohan Marzagão, Alfredo Montelongo, Carlos Roberto Veiga Kiffer, Natan Katz, Ana Carolina Lorena

MLJ 2026 pp. 15

doi:10.1007/S10994-025-06941-8

Abstract

Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty, providing predictions even with low confidence. This work proposes a novel two-step data-centric approach to enhance the performance of ML models by improving data quality and filtering low-confidence predictions. The first step involves leveraging Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, ensuring that only reliable predictions are retained. We evaluate our approach using three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. Additionally, we use alternative criteria (influence values for filtering and uncertainty for rejection) as baselines to evaluate the efficiency of the proposed method. The results demonstrate that integrating IH filtering with confidence-based rejection effectively enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications.
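The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which hardness measure, classifier, or thresholds the authors use, so everything below is an assumption made for illustration: hardness is estimated with k-Disagreeing Neighbors (kDN, the fraction of an instance's k nearest neighbours carrying a different label), the classifier is a plain k-nearest-neighbour vote, and the names `filter_hard`, `predict_or_reject`, `max_hardness`, and `min_conf` are hypothetical.

```python
import math
from collections import Counter


def k_nearest(point, X, k, skip=None):
    """Indices of the k points in X closest to `point` (Euclidean), excluding index `skip`."""
    order = sorted(
        (i for i in range(len(X)) if i != skip),
        key=lambda i: math.dist(point, X[i]),
    )
    return order[:k]


def kdn_hardness(X, y, k=3):
    """kDN score per instance: share of its k neighbours with a different label.

    Assumption: kDN stands in for whatever IH estimate the paper actually uses.
    """
    scores = []
    for i, xi in enumerate(X):
        neighbours = k_nearest(xi, X, k, skip=i)
        scores.append(sum(y[j] != y[i] for j in neighbours) / k)
    return scores


def filter_hard(X, y, k=3, max_hardness=0.5):
    """Step 1 (training): keep only instances whose hardness is at most max_hardness."""
    hardness = kdn_hardness(X, y, k)
    keep = [i for i, h in enumerate(hardness) if h <= max_hardness]
    return [X[i] for i in keep], [y[i] for i in keep]


def predict_or_reject(x, X_train, y_train, k=3, min_conf=0.8):
    """Step 2 (inference): return (label, confidence); label is None when rejected.

    Confidence here is the majority vote share among the k neighbours (illustrative).
    """
    votes = Counter(y_train[j] for j in k_nearest(x, X_train, k))
    label, count = votes.most_common(1)[0]
    conf = count / k
    return (label if conf >= min_conf else None), conf


# Toy data: two well-separated clusters plus one point labelled 1 inside the 0 cluster.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1),
     (1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (1.0, 0.9),
     (0.15, 0.15)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

X_clean, y_clean = filter_hard(X, y)
near_cluster = predict_or_reject((0.05, 0.05), X_clean, y_clean)
between_clusters = predict_or_reject((0.55, 0.55), X_clean, y_clean)
```

In this toy setup the suspiciously labelled point is dropped before training, a query close to one cluster receives a prediction, and a query midway between the clusters is rejected because its neighbours disagree. Raising `min_conf` trades a higher rejection rate for more reliable retained predictions, which is the balance the abstract refers to.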


Cite

Text

Valeriano et al. "Filtering Instances and Rejecting Predictions to Obtain Reliable Models in Healthcare." Machine Learning, 2026. doi:10.1007/S10994-025-06941-8

Markdown

[Valeriano et al. "Filtering Instances and Rejecting Predictions to Obtain Reliable Models in Healthcare." Machine Learning, 2026.](https://mlanthology.org/mlj/2026/valeriano2026mlj-filtering/) doi:10.1007/S10994-025-06941-8

BibTeX

@article{valeriano2026mlj-filtering,
  title     = {{Filtering Instances and Rejecting Predictions to Obtain Reliable Models in Healthcare}},
  author    = {Valeriano, Maria Gabriela and Marzagão, David Kohan and Montelongo, Alfredo and Kiffer, Carlos Roberto Veiga and Katz, Natan and Lorena, Ana Carolina},
  journal   = {Machine Learning},
  year      = {2026},
  pages     = {15},
  doi       = {10.1007/S10994-025-06941-8},
  volume    = {115},
  url       = {https://mlanthology.org/mlj/2026/valeriano2026mlj-filtering/}
}