Asymmetric Duos: Sidekicks Improve Uncertainty
Abstract
The go-to strategy to apply deep networks in settings where uncertainty informs decisions—ensembling multiple training runs with random initializations—is ill-suited for the extremely large-scale models and practical fine-tuning workflows of today. We introduce a new cost-effective strategy for improving the uncertainty quantification and downstream decisions of a large model (e.g. a fine-tuned ViT-B): coupling it with a less accurate but much smaller "sidekick" (e.g. a fine-tuned ResNet-34) with a fraction of the computational cost. We propose aggregating the predictions of this *Asymmetric Duo* by simple learned weighted averaging. Surprisingly, despite their inherent asymmetry, the sidekick model almost never harms the performance of the larger model. In fact, across five image classification benchmarks, and a variety of model architectures and training schemes (including soups), Asymmetric Duos significantly improve accuracy, uncertainty quantification, and selective classification metrics with only ~10–20% more computation. Code is available at: https://github.com/timgzhou/asymmetric-duos
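A minimal sketch of the aggregation idea described in the abstract, assuming a PyTorch setup: two independently fine-tuned classifiers are combined by a single learned weight on their predicted probabilities. The class, parameter names, and the exact form of the weighting below are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn


class AsymmetricDuo(nn.Module):
    """Sketch of an Asymmetric Duo: a large model plus a smaller
    'sidekick', combined by a learned weighted average of their
    predicted class probabilities. Details are assumptions, not the
    authors' exact method."""

    def __init__(self, large_model: nn.Module, sidekick: nn.Module):
        super().__init__()
        self.large_model = large_model
        self.sidekick = sidekick
        # Single learned mixing logit; sigmoid keeps the weight in (0, 1).
        self.mix_logit = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p_large = self.large_model(x).softmax(dim=-1)
        p_small = self.sidekick(x).softmax(dim=-1)
        w = torch.sigmoid(self.mix_logit)
        # Weighted average of the two predictive distributions.
        return w * p_large + (1.0 - w) * p_small


if __name__ == "__main__":
    # Illustrative pairing (e.g. ViT-B + ResNet-34) with torchvision backbones.
    from torchvision.models import resnet34, vit_b_16

    duo = AsymmetricDuo(vit_b_16(num_classes=10), resnet34(num_classes=10))
    probs = duo(torch.randn(2, 3, 224, 224))  # (2, 10) mixed probabilities
```

In practice the mixing weight would be fit on held-out data after both models are fine-tuned, so the extra training cost stays small relative to the large model.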
Cite
Text
Zhou et al. "Asymmetric Duos: Sidekicks Improve Uncertainty." Advances in Neural Information Processing Systems, 2025.

Markdown
[Zhou et al. "Asymmetric Duos: Sidekicks Improve Uncertainty." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/zhou2025neurips-asymmetric/)

BibTeX
@inproceedings{zhou2025neurips-asymmetric,
  title = {{Asymmetric Duos: Sidekicks Improve Uncertainty}},
  author = {Zhou, Tim G. and Shelhamer, Evan and Pleiss, Geoff},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2025},
  url = {https://mlanthology.org/neurips/2025/zhou2025neurips-asymmetric/}
}