RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching
Abstract
Dataset condensation aims to condense the original training dataset into a small synthetic dataset for data-efficient learning. Recently proposed dataset condensation techniques allow model trainers with limited resources to learn acceptable deep learning models on a small amount of synthetic data. However, in an adversarial environment where the original dataset is poisoned, dataset condensation may encode the poisoning information into the condensed synthetic dataset. To explore the vulnerability of dataset condensation to data poisoning, we revisit the state-of-the-art targeted data poisoning method and customize a targeted data poisoning algorithm for dataset condensation. By executing the two poisoning methods, we demonstrate that, when the synthetic dataset is condensed from a poisoned dataset, models trained on the synthetic dataset may predict the targeted sample as the attack-targeted label. To defend against data poisoning, we introduce the concept of poisoned deviation to quantify the poisoning effect. We further propose a poisoning-resilient dataset condensation algorithm with a calibration method to reduce poisoned deviation. Extensive evaluations demonstrate that our proposed algorithm can protect the synthetic dataset from data poisoning with only a minor performance drop.
Cite
Text
Zheng and Li. "RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching." Uncertainty in Artificial Intelligence, 2023.
Markdown
[Zheng and Li. "RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching." Uncertainty in Artificial Intelligence, 2023.](https://mlanthology.org/uai/2023/zheng2023uai-rdmdc/)
BibTeX
@inproceedings{zheng2023uai-rdmdc,
title = {{RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching}},
author = {Zheng, Tianhang and Li, Baochun},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2023},
pages = {2541--2550},
volume = {216},
url = {https://mlanthology.org/uai/2023/zheng2023uai-rdmdc/}
}