Bilateral Information-Aware Test-Time Adaptation for Vision-Language Models

Abstract

Test-time adaptation (TTA) fine-tunes a model on the new data it encounters during inference, which enables vision-language models to handle test data under covariate shift. Unlike training-time adaptation, TTA neither requires a validation set drawn from the test distribution nor needs to consider the worst-case distribution within a given tolerance. However, previous methods have focused primarily on the design of the adaptation objective, while the test data are either used in full or simply filtered by a fixed low-entropy selection criterion. In this paper, we analyze the weaknesses of this selection criterion and find that selecting only a fixed proportion of low-entropy samples fails to ensure optimal performance across datasets. It can also make the model over-confident on wrongly classified samples, which leads to unexpected overfitting to atypical features and compromises effective adaptation. To address these issues, we propose Bilateral Information-aware Test-Time Adaptation (BITTA), which simultaneously leverages two distinct parts of the test inputs during adaptation. Specifically, a dynamic proportion of low-entropy samples is used to learn the core representation under covariate shift, while high-entropy samples are used to unlearn atypical features. This dual approach prevents undesired memorization and yields consistently strong performance across datasets. Comprehensive experiments validate its effectiveness across various datasets and model architectures. The code is publicly available at: https://github.com/tmlr-group/BITTA.
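
The two-sided use of a test batch described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It makes several assumptions: "learning" is modeled as minimizing prediction entropy on the low-entropy subset, "unlearning" as maximizing entropy on the high-entropy subset, and the proportions are plain arguments (`low_frac`, `high_frac`), because the abstract does not specify how the dynamic proportion is chosen. All function and parameter names here are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean(values):
    """Arithmetic mean; returns 0.0 for an empty list."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

def bilateral_split(batch_logits, low_frac=0.5, high_frac=0.25):
    """Split a batch into low- and high-entropy index sets.

    low_frac and high_frac are illustrative stand-ins; the abstract only
    says the low-entropy proportion is dynamic, not how it is set.
    """
    ents = [entropy(softmax(l)) for l in batch_logits]
    order = sorted(range(len(ents)), key=lambda i: ents[i])
    n = len(order)
    n_low = max(1, int(round(low_frac * n)))
    # Keep the two subsets disjoint.
    n_high = min(max(1, int(round(high_frac * n))), n - n_low)
    low_idx = order[:n_low]
    high_idx = order[n - n_high:] if n_high > 0 else []
    return low_idx, high_idx, ents

def bilateral_loss(batch_logits, low_frac=0.5, high_frac=0.25, unlearn_weight=1.0):
    """Assumed objective: lower entropy on confident samples,
    raise entropy on uncertain ones (a guess at 'unlearning')."""
    low_idx, high_idx, ents = bilateral_split(batch_logits, low_frac, high_frac)
    learn_term = mean(ents[i] for i in low_idx)
    unlearn_term = mean(ents[i] for i in high_idx)
    return learn_term - unlearn_weight * unlearn_term

# Toy batch of 4 samples over 3 classes, from most to least confident.
batch = [[10.0, 0.0, 0.0], [5.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
low_idx, high_idx, ents = bilateral_split(batch)
loss = bilateral_loss(batch)
```

In an actual adaptation loop the loss would be computed on differentiable model outputs and back-propagated through the adapted parameters; the sketch only shows how the batch is partitioned and how the two terms pull in opposite directions.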

Cite

Text

Sun et al. "Bilateral Information-Aware Test-Time Adaptation for Vision-Language Models." International Conference on Learning Representations, 2026.

Markdown

[Sun et al. "Bilateral Information-Aware Test-Time Adaptation for Vision-Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/sun2026iclr-bilateral/)

BibTeX

@inproceedings{sun2026iclr-bilateral,
  title     = {{Bilateral Information-Aware Test-Time Adaptation for Vision-Language Models}},
  author    = {Sun, Jingwei and Zhu, Jianing and Yao, Jiangchao and Niu, Gang and Sugiyama, Masashi and Han, Bo},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/sun2026iclr-bilateral/}
}