The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation
Abstract
The growing economic importance of data has generated interest in principled methods for data valuation. Particular attention has been given to the Shapley value, a result from cooperative game theory that defines the unique distribution of a game's rewards to contributors subject to specified fairness axioms. By casting a machine learning task as a cooperative game, Shapley-based data valuation purports to equitably attribute model performance to individuals. However, the practical operationalization of this process depends on a wide array of practitioner decisions. Many of these decisions lie outside of the scope of the underlying machine learning task, introducing a potential for arbitrary decision making. The sensitivity of valuation outcomes to these intermediate decisions threatens the desired fairness properties. In light of these surfaced concerns, we evaluate the face-value equitability of Shapley for data valuation.
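To make the setup concrete, the following is a minimal sketch of exact Shapley valuation for three data contributors. It is not taken from the paper: the contributor names, the `ACCURACY` table, and the `utility` function are hypothetical stand-ins for "train a model on this subset and score it". The utility function is where practitioner decisions enter, such as which metric to use and what value to assign to the empty dataset.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, utility):
    """Exact Shapley values by enumerating every coalition (only feasible for small n)."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight for coalitions of size k that do not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values

# Hypothetical accuracies of a model trained on each subset of contributors.
ACCURACY = {
    frozenset(): 0.50,
    frozenset("a"): 0.70,
    frozenset("b"): 0.60,
    frozenset("c"): 0.60,
    frozenset("ab"): 0.80,
    frozenset("ac"): 0.80,
    frozenset("bc"): 0.75,
    frozenset("abc"): 0.90,
}

def utility(coalition):
    # Assumed utility: a table lookup standing in for train-and-evaluate.
    return ACCURACY[frozenset(coalition)]

players = ["a", "b", "c"]
values = shapley_values(players, utility)
print(values)
```

The values sum to the gain of the full dataset over the empty one, and the interchangeable contributors `b` and `c` receive equal shares. Passing a different `utility` (another metric, another baseline for the empty set) generally yields different values for the same data.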
Cite
Text
Diehl and Wilson. "The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation." ICLR 2025 Workshops: Data_Problems, 2025.
Markdown
[Diehl and Wilson. "The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation." ICLR 2025 Workshops: Data_Problems, 2025.](https://mlanthology.org/iclrw/2025/diehl2025iclrw-surprising/)
BibTeX
@inproceedings{diehl2025iclrw-surprising,
title = {{The Surprising Amount of Arbitrariness in Shapley-Value Data Valuation}},
author = {Diehl, Hannah and Wilson, Ashia C.},
booktitle = {ICLR 2025 Workshops: Data_Problems},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/diehl2025iclrw-surprising/}
}