Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability
Abstract
The field of computer vision was initially inspired by the human visual system and has progressively expanded to include a broader range of machine vision applications. Consequently, image compressors should be designed to accommodate not only human visual perception but also machine vision tasks, including closed-set scenarios that allow pre-training and open-set scenarios that involve tasks unseen until test time. Many recent studies address human visual perception and closed-set machine vision tasks simultaneously but struggle to handle open-set machine vision tasks. To address this issue, this paper proposes a fully instance-specific test-time fine-tuning (TTFT) method for adapting learned image compression (LIC) to both closed-set and open-set machine vision tasks. With our method, a large-scale LIC model, originally trained for human perception, is adapted to the target task through TTFT using Singular Value Decomposition-based Low-Rank Adaptation (SVD-LoRA). During TTFT, the decoder adopts a modified learning scheme that trains only the singular values, which helps prevent excessive bitstream overhead. This enables fully instance-specific optimization for the target task, even for open-set tasks. Experimental results demonstrate that the proposed method effectively adapts the backbone compressor to diverse machine vision tasks, outperforming competing methods.
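The idea of training only the singular values can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the bases are assumed to come from an SVD of the frozen weight, and a toy least-squares loss on a single linear layer stands in for the task loss on an LIC decoder. It only shows why the per-instance update reduces to `rank` scalars per layer.

```python
import numpy as np

def svd_lora_init(W, rank):
    """Assumption: use the top-`rank` singular directions of the frozen weight as fixed bases."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank], Vt[:rank, :]

def adapted_weight(W, U, s, Vt):
    """Frozen W plus the low-rank update U @ diag(s) @ Vt."""
    return W + (U * s) @ Vt

def mse(W, U, s, Vt, X, Y):
    """Toy stand-in for the task loss: mean squared error over the columns of X."""
    R = adapted_weight(W, U, s, Vt) @ X - Y
    return float(np.sum(R ** 2) / X.shape[1])

def finetune_singular_values(W, U, Vt, X, Y, steps=200, lr=0.1):
    """Gradient descent on s only; W, U and Vt are never modified."""
    s = np.zeros(U.shape[1])
    n = X.shape[1]
    for _ in range(steps):
        R = adapted_weight(W, U, s, Vt) @ X - Y          # residual, shape (d_out, n)
        # d(loss)/d(s_i) = (2/n) * sum_j (U^T R)_ij * (Vt X)_ij
        grad = 2.0 * np.sum((U.T @ R) * (Vt @ X), axis=1) / n
        s -= lr * grad
    return s
```

In this sketch only the vector `s` (here `rank` values) would have to be sent alongside the bitstream for each adapted layer, compared with `d_out * d_in` values for a full weight update, which is the kind of overhead saving the abstract refers to.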
Cite
Text
Park et al. "Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00418
Markdown
[Park et al. "Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/park2025cvpr-testtime/) doi:10.1109/CVPR52734.2025.00418
BibTeX
@inproceedings{park2025cvpr-testtime,
title = {{Test-Time Fine-Tuning of Image Compression Models for Multi-Task Adaptability}},
author = {Park, Unki and Jeong, Seongmoon and Jang, Youngchan and Park, Gyeong-Moon and Ko, Jong Hwan},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
pages = {4430-4440},
doi = {10.1109/CVPR52734.2025.00418},
url = {https://mlanthology.org/cvpr/2025/park2025cvpr-testtime/}
}