CLAIRE: Clustering Evaluation Based on Item Response Theory and Model Agreement

Junior, Manuel Ferreira; de Andrade Lima Neto, Eufrásio; Ferreira, Marcelo Rodrigo Portela; de Menezes e Silva Filho, Telmo; Prudêncio, Ricardo B. C.

doi:10.1007/S10994-025-06911-0

MLJ 2025 pp. 256

Abstract

Clustering evaluation is a complex task. External measures, such as the Rand index, are often used for benchmarking, but they are not applicable in real, unsupervised scenarios due to the lack of ground truth. Thus, we often turn to internal measures for model evaluation, e.g., silhouette, Dunn, and Davies-Bouldin. These indexes have the advantage of evaluating models based on the clustered data points themselves; however, they rely on a chosen distance, so they are only meaningful for models that use the same distance. Additionally, they fail if all instances are assigned to a single cluster, and they aim to evaluate separation and cohesion instead of quantifying a model’s ability to recover any underlying classes. Thus, internal measures are not suited to comparing models that are estimated differently. In this paper, we propose CLAIRE (CLuster Agreement-based Item REsponses), a method for global evaluation of clustering models that assumes good models agree on whether pairs of instances should be clustered together or not. We leverage Item Response Theory to estimate model ability and instance difficulty, using response matrices obtained by measuring the agreement between models. Experiments were carried out using diverse sets of clustering methods and datasets with different numbers of clusters and varying shapes and levels of overlap and noise. Results show that CLAIRE is robust to the presence of random partitions in the pool of models and correctly ranks models across the many tested scenarios, with a surprisingly high correlation with external measures of clustering quality, meaning it also indirectly evaluates the recovery of underlying classes.
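The abstract does not spell out how the response matrices are built, so the following is only a minimal sketch of the general idea of pairwise model agreement, not the paper's procedure. It assumes a simple majority-vote rule: each model "answers" each pair of instances (together or apart), and its response is 1 when that answer matches the majority of the pool. The function names `pairwise_together` and `agreement_responses` are hypothetical, and the row mean at the end is a crude stand-in for ability; fitting an actual Item Response Theory model is omitted.

```python
import numpy as np


def pairwise_together(labels):
    """For every instance pair (i < j), return 1 if both share a cluster, else 0."""
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)
    return (labels[i] == labels[j]).astype(int)


def agreement_responses(partitions):
    """Build a models-by-pairs 0/1 matrix.

    Assumed rule (not taken from the paper): an entry is 1 when the model's
    together/apart decision for a pair matches the majority of the pool.
    """
    decisions = np.array([pairwise_together(p) for p in partitions])
    majority = (decisions.mean(axis=0) >= 0.5).astype(int)
    return (decisions == majority).astype(int)


# Three toy partitions of four instances. The second is the first with
# cluster labels swapped, so its pairwise decisions are identical.
partitions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]

responses = agreement_responses(partitions)
crude_ability = responses.mean(axis=1)  # proxy only; an IRT fit would replace this

print(responses)
print(crude_ability)
```

Working on pairs rather than raw labels makes the comparison invariant to label permutations and independent of any distance function, which is the property the abstract contrasts with internal measures.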

Cite

Text

Junior et al. "CLAIRE: Clustering Evaluation Based on Item Response Theory and Model Agreement." Machine Learning, 2025. doi:10.1007/S10994-025-06911-0

Markdown

[Junior et al. "CLAIRE: Clustering Evaluation Based on Item Response Theory and Model Agreement." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/junior2025mlj-claire/) doi:10.1007/S10994-025-06911-0

BibTeX

@article{junior2025mlj-claire,
  title     = {{CLAIRE: Clustering Evaluation Based on Item Response Theory and Model Agreement}},
  author    = {Junior, Manuel Ferreira and de Andrade Lima Neto, Eufrásio and Ferreira, Marcelo Rodrigo Portela and de Menezes e Silva Filho, Telmo and Prudêncio, Ricardo B. C.},
  journal   = {Machine Learning},
  year      = {2025},
  pages     = {256},
  doi       = {10.1007/S10994-025-06911-0},
  volume    = {114},
  url       = {https://mlanthology.org/mlj/2025/junior2025mlj-claire/}
}