Two-Temperature Logistic Regression Based on the Tsallis Divergence
Abstract
We develop a variant of multiclass logistic regression that is significantly more robust to noise. The algorithm has one weight vector per class and the surrogate loss is a function of the linear activations (one per class). The surrogate loss of an example with linear activation vector $\mathbf{a}$ and class $c$ has the form $-\log_{t_1} \exp_{t_2} (a_c - G_{t_2}(\mathbf{a}))$ where the two temperatures $t_1$ and $t_2$ “temper” the $\log$ and $\exp$, respectively, and $G_{t_2}(\mathbf{a})$ is a scalar value that generalizes the log-partition function. We motivate this loss using the Tsallis divergence. Our method allows transitioning between non-convex and convex losses by the choice of the temperature parameters. As the temperature $t_1$ of the logarithm becomes smaller than the temperature $t_2$ of the exponential, the surrogate loss becomes “quasi-convex”. Various tunings of the temperatures recover previous methods, and tuning the degree of non-convexity is crucial in the experiments. In particular, quasi-convexity and boundedness of the loss provide significant robustness to outliers. We explain this by showing that $t_1 < 1$ caps the surrogate loss and $t_2 > 1$ makes the predictive distribution have a heavy tail. We show that the surrogate loss is Bayes-consistent, even in the non-convex case. Additionally, we provide efficient iterative algorithms that calculate the log-partition value in only a few iterations. Our compelling experimental results on large real-world datasets show the advantage of using the two-temperature variant in the noisy as well as the noise-free case.
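The form of the loss stated above can be illustrated with a minimal Python sketch. It assumes the standard Tsallis definitions $\log_t(x) = (x^{1-t} - 1)/(1-t)$ and $\exp_t(x) = [1 + (1-t)x]_+^{1/(1-t)}$, which the abstract does not spell out, and it takes $G_{t_2}(\mathbf{a})$ to be the scalar that makes $\sum_i \exp_{t_2}(a_i - G_{t_2}(\mathbf{a})) = 1$. The normalizer is found here by plain bisection, which is a stand-in and not the iterative algorithm of the paper. The function names `log_t`, `exp_t`, `normalizer`, and `two_temp_loss` are hypothetical.

```python
import math

def log_t(x, t):
    """Tempered logarithm (assumed Tsallis form); equals math.log at t = 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential (assumed Tsallis form); equals math.exp at t = 1."""
    if t == 1.0:
        return math.exp(x)
    base = max(0.0, 1.0 + (1.0 - t) * x)
    if base == 0.0:
        # Clipped branch: zero mass for t < 1, divergence for t > 1.
        return 0.0 if t < 1.0 else math.inf
    return base ** (1.0 / (1.0 - t))

def normalizer(a, t, iters=100):
    """Find G with sum_i exp_t(a_i - G) = 1 by bisection (a stand-in method).

    The sum is non-increasing in G and is at least 1 at G = max(a),
    so the root lies in [max(a), hi] for a large enough hi.
    """
    total = lambda g: sum(exp_t(x - g, t) for x in a)
    lo = max(a)
    hi = lo + 1.0
    while total(hi) > 1.0:          # grow the bracket until the sum drops below 1
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def two_temp_loss(a, c, t1, t2):
    """Surrogate loss -log_{t1} exp_{t2}(a_c - G_{t2}(a)) for class index c."""
    g = normalizer(a, t2)
    return -log_t(exp_t(a[c] - g, t2), t1)
```

Under these assumptions, setting $t_1 = t_2 = 1$ gives the usual softmax cross-entropy, and for $t_1 < 1$ the loss cannot exceed $1/(1 - t_1)$ however badly an example is misclassified, for instance `two_temp_loss([0.0, 0.0, 100.0], 0, 0.5, 1.5)` stays below 2.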
Cite
Text
Amid et al. "Two-Temperature Logistic Regression Based on the Tsallis Divergence." Artificial Intelligence and Statistics, 2019.
Markdown
[Amid et al. "Two-Temperature Logistic Regression Based on the Tsallis Divergence." Artificial Intelligence and Statistics, 2019.](https://mlanthology.org/aistats/2019/amid2019aistats-twotemperature/)
BibTeX
@inproceedings{amid2019aistats-twotemperature,
title = {{Two-Temperature Logistic Regression Based on the Tsallis Divergence}},
author = {Amid, Ehsan and Warmuth, Manfred K. and Srinivasan, Sriram},
booktitle = {Artificial Intelligence and Statistics},
year = {2019},
pages = {2388-2396},
volume = {89},
url = {https://mlanthology.org/aistats/2019/amid2019aistats-twotemperature/}
}