TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression

Abstract

Recent vision architectures and self-supervised training methods have made it possible to train computer vision models that are extremely accurate, but that come with massive computational costs. In settings such as identifying species from camera traps in the field, users have limited resources and may fine-tune a pretrained model on (often limited) data from a small set of specific categories of interest. Such users may still wish to benefit from highly accurate large models, but are often constrained by computational cost. To address this, we ask: can we quickly compress generalist models into accurate and efficient specialists given a small amount of data? Towards this goal, we propose a simple and versatile technique, which we call Few-Shot Task-Aware COmpression (TACO). Given a general-purpose model pretrained on a broad task, such as classification on the ImageNet or iNaturalist datasets with thousands of categories, TACO produces a much smaller model that is accurate on specialized tasks, such as classifying across vehicle types or animal species, based only on a few examples from each target class. The method is based on two key insights: (1) a powerful specialization effect for data-aware compression, which we showcase for the first time; and (2) a dedicated fine-tuning procedure with knowledge distillation, which prevents overfitting even in scenarios where data is very scarce. Specifically, TACO is applied in a few-shot fashion, i.e., only a few task-specific samples are used for compression, and the procedure has low computational overhead. We validate this approach experimentally using highly accurate ResNet, ViT/DeiT, and ConvNeXt models, originally trained on the ImageNet and iNaturalist datasets, which we specialize and compress to a diverse set of "downstream" subtasks, obtaining notable computational speedups on both CPU and GPU.
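
To make the specialize-then-compress recipe concrete, the sketch below shows one plausible instantiation in PyTorch: a copy of the dense generalist is compressed and then fine-tuned on the few-shot task data with knowledge distillation from the original model. This is a minimal sketch, not the paper's method: plain L1 magnitude pruning stands in for the data-aware compression described in the abstract, and the function name (taco_specialize), the few_shot_loader argument, and all hyperparameters are illustrative assumptions.

import copy

import torch
import torch.nn.functional as F
import torch.nn.utils.prune as prune


def taco_specialize(teacher, few_shot_loader, sparsity=0.7, epochs=5,
                    temperature=2.0, alpha=0.5, device="cuda"):
    """Compress a generalist model into a task specialist (illustrative sketch).

    Step 1 prunes a copy of the teacher (L1 magnitude pruning stands in for
    the data-aware compression used in the paper). Step 2 fine-tunes the
    pruned student on the few-shot task data with knowledge distillation
    from the dense teacher, which helps prevent overfitting the tiny set.
    """
    teacher = teacher.to(device).eval()
    student = copy.deepcopy(teacher).to(device)

    # --- Step 1: compression ---
    for module in student.modules():
        if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
            prune.l1_unstructured(module, name="weight", amount=sparsity)

    # --- Step 2: few-shot fine-tuning with distillation ---
    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)
    student.train()
    for _ in range(epochs):
        for images, labels in few_shot_loader:
            # Labels are assumed to index the teacher's original class space.
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():
                teacher_logits = teacher(images)
            student_logits = student(images)
            # Standard temperature-scaled distillation loss (Hinton et al.).
            kd_loss = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=-1),
                F.softmax(teacher_logits / temperature, dim=-1),
                reduction="batchmean",
            ) * temperature ** 2
            ce_loss = F.cross_entropy(student_logits, labels)
            loss = alpha * kd_loss + (1 - alpha) * ce_loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Bake the pruning masks into the weights so the student is standalone.
    for module in student.modules():
        if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
            prune.remove(module, "weight")
    return student

In practice the teacher would be, for example, a torchvision ResNet-50 pretrained on ImageNet, and few_shot_loader a DataLoader over the handful of labeled task samples; the compression scheme, sparsity level, and distillation weight alpha are the knobs that trade accuracy against speedup.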

Cite

Text

Kuznedelev et al. "TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression." Transactions on Machine Learning Research, 2025.

Markdown

[Kuznedelev et al. "TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/kuznedelev2025tmlr-taco/)

BibTeX

@article{kuznedelev2025tmlr-taco,
  title     = {{TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression}},
  author    = {Kuznedelev, Denis and Tabesh, Soroush and Noorbakhsh, Kimia and Frantar, Elias and Beery, Sara and Kurtic, Eldar and Alistarh, Dan},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/kuznedelev2025tmlr-taco/}
}