[Re] Classwise-Shapley Values for Data Valuation

Abstract

We evaluate CS-Shapley, a data valuation method for classification problems introduced by Schoch et al. (2022). We repeat the experiments in the paper and add two methods, the Least Core (Yan & Procaccia, 2021) and Data Banzhaf (Wang & Jia, 2023), a comparison not found in the literature. We include more conservative error estimates and additional metrics, such as rank stability and a variance-corrected version of Weighted Accuracy Drop, a metric originally introduced in Schoch et al. (2022). We conclude that while CS-Shapley helps in the scenarios it was originally tested in, in particular the detection of corrupted labels, it is outperformed by the conceptually simpler Data Banzhaf in the task of detecting highly influential points.

Cite

Text

Semmler and de Benito Delgado. "[Re] Classwise-Shapley Values for Data Valuation." Transactions on Machine Learning Research, 2024.

Markdown

[Semmler and de Benito Delgado. "[Re] Classwise-Shapley Values for Data Valuation." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/semmler2024tmlr-re/)

BibTeX

@article{semmler2024tmlr-re,
  title     = {{[Re] Classwise-Shapley Values for Data Valuation}},
  author    = {Semmler, Markus and de Benito Delgado, Miguel},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/semmler2024tmlr-re/}
}