The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation

Abstract

The Source-Free Video Unsupervised Domain Adaptation (SFVUDA) task consists of adapting an action recognition model, trained on a labelled source dataset, to an unlabelled target dataset without access to the source data. Previous approaches have addressed SFVUDA by leveraging self-supervision (e.g., enforcing temporal consistency) derived from the target data itself. In this work, we take an orthogonal approach and exploit "web-supervision" from Large Language-Vision Models (LLVMs), driven by the rationale that LLVMs contain a rich world prior that is surprisingly robust to domain shift. We showcase the unreasonable effectiveness of integrating LLVMs for SFVUDA by devising an intuitive and parameter-efficient method, which we name Domain Adaptation with Large Language-Vision models (DALL-V), that distills the world prior and complementary source model information into a student network tailored for the target. Despite its simplicity, DALL-V achieves significant improvements over state-of-the-art SFVUDA methods.

Cite

Text

Zara et al. "The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00946

Markdown

[Zara et al. "The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/zara2023iccv-unreasonable/) doi:10.1109/ICCV51070.2023.00946

BibTeX

@inproceedings{zara2023iccv-unreasonable,
  title     = {{The Unreasonable Effectiveness of Large Language-Vision Models for Source-Free Video Domain Adaptation}},
  author    = {Zara, Giacomo and Conti, Alessandro and Roy, Subhankar and Lathuilière, Stéphane and Rota, Paolo and Ricci, Elisa},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {10307--10317},
  doi       = {10.1109/ICCV51070.2023.00946},
  url       = {https://mlanthology.org/iccv/2023/zara2023iccv-unreasonable/}
}