Understanding and Reducing the Class-Dependent Effects of Data Augmentation with a Two-Player Game Approach

Abstract

Data augmentation is widely applied and has shown its benefits in different machine learning tasks. However, as recently observed, it may have an unfair effect in multi-class classification. While data augmentation generally improves the overall performance (and is therefore beneficial for many classes), it can actually be detrimental for other classes, which can be problematic in some application domains. In this paper, to counteract this phenomenon, we propose CLAM, a CLAss-dependent Multiplicative-weights method. To derive it, we first formulate the training of a classifier as a non-linear optimization problem that aims at simultaneously maximizing the individual class performances and balancing them. By rewriting this optimization problem as an adversarial two-player game, we propose a novel multiplicative weight algorithm, whose convergence we prove. Interestingly, our formulation also reveals that the class-dependent effects of data augmentation are not due to data augmentation alone, but are in fact a general phenomenon. Our empirical results over five datasets demonstrate that the performance of learned classifiers is indeed more fairly distributed over classes, with only limited impact on the average accuracy.
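To give a rough sense of what a multiplicative-weights update over classes can look like, here is a minimal sketch of a generic Hedge-style step. It is not the CLAM algorithm itself, whose exact update rule the abstract does not give. The function name `multiplicative_weights_update`, the step size `eta`, and the per-class loss values are all hypothetical and chosen only for illustration.

```python
import math


def multiplicative_weights_update(weights, class_losses, eta=0.5):
    """One generic Hedge-style step (an assumption, not the paper's exact rule).

    Classes with a higher loss get exponentially more weight, and the
    weights are then renormalized to sum to 1.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, class_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical per-class losses for a 3-class problem; class 1 performs worst.
class_losses = [0.2, 0.9, 0.4]

# Start from uniform class weights and apply a few updates.
weights = [1 / 3, 1 / 3, 1 / 3]
for _ in range(10):
    weights = multiplicative_weights_update(weights, class_losses)
```

In a two-player reading of this sketch, one player would adjust the class weights as above while the other trains the classifier against the resulting weighted loss. Here the losses are held fixed purely to keep the example self-contained.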

Cite

Text

Jiang et al. "Understanding and Reducing the Class-Dependent Effects of Data Augmentation with a Two-Player Game Approach." Transactions on Machine Learning Research, 2025.

Markdown

[Jiang et al. "Understanding and Reducing the Class-Dependent Effects of Data Augmentation with a Two-Player Game Approach." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/jiang2025tmlr-understanding/)

BibTeX

@article{jiang2025tmlr-understanding,
  title     = {{Understanding and Reducing the Class-Dependent Effects of Data Augmentation with a Two-Player Game Approach}},
  author    = {Jiang, Yunpeng and Ban, Yutong and Weng, Paul},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/jiang2025tmlr-understanding/}
}