Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations
Abstract
Spatially dense self-supervised learning is a rapidly growing problem domain with promising applications for unsupervised segmentation and pretraining for dense downstream tasks. Despite the abundance of temporal data in the form of videos, this information-rich source has been largely overlooked. Our paper aims to address this gap by proposing a novel approach that incorporates temporal consistency in dense self-supervised learning. While methods designed solely for images face difficulties in achieving even the same performance on videos, our method improves the representation quality not only for videos but also for images. Our approach, which we call time-tuning, starts from image-pretrained models and fine-tunes them with a novel self-supervised temporal-alignment clustering loss on unlabeled videos. This effectively facilitates the transfer of high-level information from videos to image representations. Time-tuning improves the state-of-the-art by 8-10% for unsupervised semantic segmentation on videos and matches it for images. We believe this method paves the way for further self-supervised scaling by leveraging the abundant availability of videos. The implementation is available at https://github.com/SMSD75/Timetuning
Cite
Text
Salehi et al. "Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01516
Markdown
[Salehi et al. "Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/salehi2023iccv-time/) doi:10.1109/ICCV51070.2023.01516
BibTeX
@inproceedings{salehi2023iccv-time,
title = {{Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations}},
author = {Salehi, Mohammadreza and Gavves, Efstratios and Snoek, Cees G.M. and Asano, Yuki M.},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {16536-16547},
doi = {10.1109/ICCV51070.2023.01516},
url = {https://mlanthology.org/iccv/2023/salehi2023iccv-time/}
}