Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability

Abstract

The field of computer vision was initially inspired by the human visual system and has progressively expanded to include a broader range of machine vision applications. Consequently, image compressors should be designed to accommodate not only human visual perception but also machine vision tasks, including closed-set scenarios, in which the tasks are known in advance and allow pre-training, and open-set scenarios, in which previously unseen tasks appear at test time. Many recent studies address human visual perception and closed-set machine vision tasks simultaneously but struggle to handle open-set machine vision tasks. To address this issue, this paper proposes fully instance-specific test-time fine-tuning (TTFT) for adapting learned image compression (LIC) to both closed-set and open-set machine vision tasks. With our method, a large-scale LIC model, originally trained for human perception, is adapted to the target task through TTFT using singular value decomposition-based low-rank adaptation (SVD-LoRA). During TTFT, the decoder adopts a modified learning scheme that trains only the singular values, which helps prevent excessive bitstream overhead. This enables fully instance-specific optimization for the target task, even for open-set tasks. Experimental results demonstrate that the proposed method effectively adapts the backbone compressor to diverse machine vision tasks, outperforming competing methods.
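The idea of training only the singular values can be illustrated with a small NumPy sketch. This is one plausible reading of the abstract, not the authors' implementation: the helper names (`decompose`, `adapted_weight`), the rank, the toy least-squares loss, and the learning rate are all assumptions made for illustration. A frozen weight matrix is split into its top singular directions plus a residual, and only a per-singular-value offset `delta` is updated, so the number of values that would need to be signalled per layer equals the chosen rank.

```python
import numpy as np

def decompose(weight, rank):
    """Split a frozen weight into its top-`rank` SVD factors and a residual."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    u_r, s_r, vt_r = u[:, :rank], s[:rank], vt[:rank, :]
    residual = weight - (u_r * s_r) @ vt_r
    return u_r, s_r, vt_r, residual

def adapted_weight(u_r, s_r, vt_r, residual, delta):
    """Rebuild the weight with only the singular values shifted by `delta`."""
    return residual + (u_r * (s_r + delta)) @ vt_r

rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 32))   # hypothetical frozen decoder weight
rank = 4                                 # assumed rank, chosen for illustration
u_r, s_r, vt_r, residual = decompose(weight, rank)

# Toy stand-in for a task loss: 0.5 * ||W' x - y||^2 (an assumption).
x = rng.standard_normal((32, 8))
y = rng.standard_normal((64, 8))

def loss(delta):
    err = adapted_weight(u_r, s_r, vt_r, residual, delta) @ x - y
    return 0.5 * np.sum(err ** 2)

delta = np.zeros(rank)                   # the only trainable values
initial_loss = loss(delta)
for _ in range(50):
    err = adapted_weight(u_r, s_r, vt_r, residual, delta) @ x - y
    # Gradient w.r.t. delta_i is <err, u_i v_i^T x>; U and V stay frozen.
    grad = np.sum((u_r.T @ err) * (vt_r @ x), axis=1)
    delta -= 0.01 * grad
final_loss = loss(delta)
```

In this sketch the singular vectors never change, so a receiver holding the same frozen weight could rebuild the adapted matrix from the `rank` offsets alone, compared with `64 * 32` values for the full matrix.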

Cite

Text

Park et al. "Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00418

Markdown

[Park et al. "Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/park2025cvpr-testtime/) doi:10.1109/CVPR52734.2025.00418

BibTeX

@inproceedings{park2025cvpr-testtime,
  title     = {{Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability}},
  author    = {Park, Unki and Jeong, Seongmoon and Jang, Youngchan and Park, Gyeong-Moon and Ko, Jong Hwan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {4430-4440},
  doi       = {10.1109/CVPR52734.2025.00418},
  url       = {https://mlanthology.org/cvpr/2025/park2025cvpr-testtime/}
}