Improving Foundation Model Group Robustness with Auxiliary Sentence Embeddings

Abstract

This paper addresses the challenge of mitigating group-based biases in vision-language foundation models, a pressing issue for trustworthy AI deployment. We introduce DoubleCCA, a computationally efficient framework that enriches textual representations to enhance group robustness. Our key idea is to use an auxiliary large sentence embedding model to capture diverse semantic perspectives, counteracting the biased representations induced by limited training data. To this end, we propose a two-stage Canonical Correlation Analysis (CCA) technique: the first stage aligns the augmented and original embeddings in a shared space, and the second stage reconstructs invariant features so that they align with the visual representations. We further propose a simple sentence augmentation approach that improves the robustness of the CCA-induced subspaces. The method is simple to implement and can be integrated into existing models, making it a practical way to improve the robustness of vision-language foundation models to group-based biases. Experiments on a variety of datasets show that our method outperforms existing methods in both performance and robustness. Our code is available at https://github.com/sisuolv/doublecca.
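As a rough illustration of the two-stage idea, the following NumPy sketch fits a regularized linear CCA twice. It is not the authors' implementation (see the linked repository for that). The function names `fit_cca` and `double_cca`, the averaging of the two projected views, the pseudo-inverse mapping back to the original text space, and the regularization value are all assumptions made for illustration.

```python
import numpy as np


def _inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T


def fit_cca(X, Y, k, reg=1e-3):
    """Regularized linear CCA via whitening and SVD (a standard formulation)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Sxx), _inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return {"mx": mx, "my": my, "A": Kx @ U[:, :k], "B": Ky @ Vt[:k].T, "corr": s[:k]}


def double_cca(T, S, k):
    """Hypothetical two-stage CCA.

    T: original text embeddings (n x d1), e.g. from the foundation model.
    S: auxiliary sentence embeddings (n x d2) for the same sentences.
    Requires k <= min(d1, d2).
    """
    # Stage 1: align original and auxiliary embeddings in a shared k-dim space.
    c1 = fit_cca(T, S, k)
    Z = 0.5 * ((T - c1["mx"]) @ c1["A"] + (S - c1["my"]) @ c1["B"])  # assumed fusion rule
    # Stage 2 (assumed form): relate the shared features to the original text
    # space, then map back so the result can be compared with image embeddings.
    c2 = fit_cca(Z, T, k)
    T_hat = (Z - c2["mx"]) @ c2["A"] @ np.linalg.pinv(c2["B"]) + c2["my"]
    return T_hat, c1, c2
```

Under these assumptions, `T_hat` has the same shape as `T` and could stand in for the original text embeddings when scoring against image embeddings; whether this matches the paper's reconstruction step should be checked against the released code.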

Cite

Text

Lyu et al. "Improving Foundation Model Group Robustness with Auxiliary Sentence Embeddings." Transactions on Machine Learning Research, 2026.

Markdown

[Lyu et al. "Improving Foundation Model Group Robustness with Auxiliary Sentence Embeddings." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/lyu2026tmlr-improving/)

BibTeX

@article{lyu2026tmlr-improving,
  title     = {{Improving Foundation Model Group Robustness with Auxiliary Sentence Embeddings}},
  author    = {Lyu, Sisuo and Liu, Hong and Li, Jie and Teng, Yan and Wang, Yingchun},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/lyu2026tmlr-improving/}
}