GMValuator: Similarity-Based Data Valuation for Generative Models

Abstract

Data valuation plays a crucial role in machine learning. Existing data valuation methods focus mainly on discriminative models and overlook generative models, which have recently gained considerable attention. For generative models, data valuation measures the impact of training data on generated datasets. The few existing data valuation methods designed for deep generative models either target specific models or lack robustness in their outcomes. Moreover, their efficiency remains a notable shortcoming. To bridge these gaps, we formulate the data valuation problem in generative models from a similarity-matching perspective. Specifically, we introduce Generative Model Valuator (GMValuator), the first training-free and model-agnostic approach to data valuation for image generation tasks. It enables efficient data valuation through a similarity matching module, calibrates biased contributions by incorporating image quality assessment, and attributes credit to all training samples based on their contributions to the generated samples. Additionally, we introduce four evaluation criteria for assessing data valuation methods in generative models. GMValuator is extensively evaluated on benchmark and high-resolution datasets and on various mainstream generative architectures to demonstrate its effectiveness. Our code is available at: https://github.com/ubc-tea/GMValuator.
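
The three steps named in the abstract (similarity matching, quality calibration, credit attribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `similarity_valuation`, the use of precomputed embeddings, Euclidean distance, the top-`k` softmax-style weighting, and the per-sample `gen_quality` scores are all assumptions made for illustration only.

```python
import numpy as np


def similarity_valuation(train_emb, gen_emb, gen_quality=None, k=3):
    """Toy similarity-based valuation (hypothetical, for illustration).

    train_emb:   (n_train, d) embeddings of training samples
    gen_emb:     (n_gen, d) embeddings of generated samples
    gen_quality: optional (n_gen,) quality scores; defaults to all ones
    k:           number of nearest training samples credited per generated sample
    Returns an (n_train,) array of values that sums to 1 (or all zeros).
    """
    train_emb = np.asarray(train_emb, dtype=float)
    gen_emb = np.asarray(gen_emb, dtype=float)
    n_train, n_gen = train_emb.shape[0], gen_emb.shape[0]
    k = min(k, n_train)
    if gen_quality is None:
        gen_quality = np.ones(n_gen)
    gen_quality = np.asarray(gen_quality, dtype=float)

    # Step 1 (similarity matching): pairwise Euclidean distances, shape (n_gen, n_train).
    dists = np.linalg.norm(gen_emb[:, None, :] - train_emb[None, :, :], axis=2)

    values = np.zeros(n_train)
    for j in range(n_gen):
        # Indices of the k closest training samples to generated sample j.
        nearest = np.argsort(dists[j])[:k]
        # Closer training samples receive a larger share of the credit.
        weights = np.exp(-dists[j, nearest])
        weights /= weights.sum()
        # Step 2 (quality calibration): scale the credit by the quality score.
        # Step 3 (credit attribution): accumulate the credit per training sample.
        values[nearest] += gen_quality[j] * weights

    total = values.sum()
    return values / total if total > 0 else values


# Usage: two generated points sit near the first two training points;
# the distant third training point receives no credit.
train = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
gen = [[0.1, 0.0], [0.9, 0.0]]
print(similarity_valuation(train, gen, k=2))
```

In this sketch, training samples that are never among the `k` nearest neighbours of any generated sample receive a value of zero, and lowering a generated sample's quality score reduces the credit passed on to its matches.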

Cite

Text

Yang et al. "GMValuator: Similarity-Based Data Valuation for Generative Models." International Conference on Learning Representations, 2025.

Markdown

[Yang et al. "GMValuator: Similarity-Based Data Valuation for Generative Models." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/yang2025iclr-gmvaluator/)

BibTeX

@inproceedings{yang2025iclr-gmvaluator,
  title     = {{GMValuator: Similarity-Based Data Valuation for Generative Models}},
  author    = {Yang, Jiaxi and Deng, Wenlong and Liu, Benlin and Huang, Yangsibo and Zou, James and Li, Xiaoxiao},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/yang2025iclr-gmvaluator/}
}