The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation

Abstract

The growing economic importance of data has generated interest in principled methods for data valuation. Particular attention has been given to the Shapley value, a result from cooperative game theory that defines the unique distribution of a game's rewards to contributors subject to specified fairness axioms. By casting a machine learning task as a cooperative game, Shapley-based data valuation purports to equitably attribute model performance to individual data contributors. However, the practical operationalization of this process depends on a wide array of practitioner decisions. Many of these decisions lie outside the scope of the underlying machine learning task, introducing a potential for arbitrary decision-making. The sensitivity of valuation outcomes to these intermediate decisions threatens the desired fairness properties. In light of these concerns, we evaluate whether Shapley-based data valuation is as equitable as it appears at face value.
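
To make the abstract's setup concrete, the following is a minimal sketch of exact Shapley-value computation for a three-contributor game. It is not the authors' code. The function name `shapley_values` and the two utility tables `UTILITY_A` and `UTILITY_B` are hypothetical, and the numbers in the tables are invented purely for illustration (for example, they could stand in for two different performance metrics a practitioner might choose).

```python
from itertools import permutations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            totals[p] += current - previous
            previous = current
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def fs(*items):
    """Shorthand for building frozenset keys."""
    return frozenset(items)


# Hypothetical utilities (invented numbers, illustration only):
# each maps a coalition of data contributors to a model score.
UTILITY_A = {
    fs(): 0.0,
    fs("a"): 0.6, fs("b"): 0.5, fs("c"): 0.1,
    fs("a", "b"): 0.7, fs("a", "c"): 0.7, fs("b", "c"): 0.6,
    fs("a", "b", "c"): 0.8,
}
UTILITY_B = {
    fs(): 0.0,
    fs("a"): 0.2, fs("b"): 0.5, fs("c"): 0.4,
    fs("a", "b"): 0.6, fs("a", "c"): 0.5, fs("b", "c"): 0.7,
    fs("a", "b", "c"): 0.8,
}

players = ["a", "b", "c"]
values_a = shapley_values(players, UTILITY_A.__getitem__)
values_b = shapley_values(players, UTILITY_B.__getitem__)

print("utility A:", values_a)
print("utility B:", values_b)
```

In this toy setup both utilities assign the same score to the full dataset, yet the resulting valuations rank the contributors differently. This is only an illustration of how a choice made outside the learning task itself can change who is credited; it makes no claim about the specific decisions studied in the paper. Enumerating all orderings is feasible only for very small games, so practical use would likely rely on some form of approximation.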

Cite

Text

Diehl and Wilson. "The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation." ICLR 2025 Workshops: Data_Problems, 2025.

Markdown

[Diehl and Wilson. "The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation." ICLR 2025 Workshops: Data_Problems, 2025.](https://mlanthology.org/iclrw/2025/diehl2025iclrw-surprising/)

BibTeX

@inproceedings{diehl2025iclrw-surprising,
  title     = {{The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation}},
  author    = {Diehl, Hannah and Wilson, Ashia C.},
  booktitle = {ICLR 2025 Workshops: Data_Problems},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/diehl2025iclrw-surprising/}
}