Decoupled Kullback-Leibler Divergence Loss
Abstract
In this paper, we delve deeper into the Kullback–Leibler (KL) Divergence loss and mathematically prove that it is equivalent to the Decoupled Kullback–Leibler (DKL) Divergence loss, which consists of 1) a weighted Mean Square Error ($\mathbf{w}$MSE) loss and 2) a Cross-Entropy loss incorporating soft labels. Thanks to the decomposed formulation of the DKL loss, we identify two areas for improvement. Firstly, we address the limitation of KL/DKL in scenarios like knowledge distillation by breaking its asymmetric optimization property. This modification ensures that the $\mathbf{w}$MSE component is always effective during training, providing extra constructive cues. Secondly, we introduce class-wise global information into KL/DKL to mitigate bias from individual samples. With these two enhancements, we derive the Improved Kullback–Leibler (IKL) Divergence loss and evaluate its effectiveness through experiments on the CIFAR-10/100 and ImageNet datasets, focusing on adversarial training and knowledge distillation tasks. The proposed approach achieves new state-of-the-art adversarial robustness on the public leaderboard, RobustBench, and competitive performance on knowledge distillation, demonstrating its substantial practical merits. Our code is available at https://github.com/jiequancui/DKL.
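As a concrete reference point for the loss the abstract starts from, the snippet below is a minimal PyTorch sketch of the vanilla KL divergence loss as it is commonly used in knowledge distillation (temperature-softened teacher and student distributions). It is not the paper's DKL/IKL implementation; the function name, default temperature, and T² scaling are illustrative assumptions, and the official code lives at https://github.com/jiequancui/DKL.

```python
import torch
import torch.nn.functional as F

def kl_divergence_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       temperature: float = 4.0) -> torch.Tensor:
    """Generic KL-divergence distillation loss (not the paper's DKL/IKL)."""
    # Soften both distributions with the temperature.
    log_q = F.log_softmax(student_logits / temperature, dim=1)  # student (log-probs)
    p = F.softmax(teacher_logits / temperature, dim=1)          # teacher (probs)
    # KL(p || q), scaled by T^2 to keep gradient magnitudes comparable
    # to the hard-label cross-entropy term.
    return F.kl_div(log_q, p, reduction="batchmean") * temperature ** 2
```

The paper's decomposition then rewrites this objective as a weighted MSE term plus a cross-entropy term with soft labels, which is what enables the two modifications (breaking the asymmetric optimization and adding class-wise global information) described above.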
Cite
Text
Cui et al. "Decoupled Kullback-Leibler Divergence Loss." Neural Information Processing Systems, 2024. doi:10.52202/079017-2370
Markdown
[Cui et al. "Decoupled Kullback-Leibler Divergence Loss." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/cui2024neurips-decoupled/) doi:10.52202/079017-2370
BibTeX
@inproceedings{cui2024neurips-decoupled,
title = {{Decoupled Kullback-Leibler Divergence Loss}},
author = {Cui, Jiequan and Tian, Zhuotao and Zhong, Zhisheng and Qi, Xiaojuan and Yu, Bei and Zhang, Hanwang},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2370},
url = {https://mlanthology.org/neurips/2024/cui2024neurips-decoupled/}
}