Hystar: Hypernetwork-Driven Style-Adaptive Retrieval via Dynamic SVD Modulation
Abstract
Query-based image retrieval (QBIR) requires retrieving relevant images given diverse and often stylistically heterogeneous queries, such as sketches, artworks, or low-resolution previews. While large-scale vision-language representation models (VLRMs) like CLIP offer strong zero-shot retrieval performance, they struggle with distribution shifts caused by unseen query styles. In this paper, we propose Hypernetwork-driven Style-adaptive Retrieval (Hystar), a lightweight framework that dynamically adapts model weights to each query’s style. Hystar employs a hypernetwork to generate singular-value perturbations ($\Delta S$) for attention layers, enabling flexible per-input adaptation, while static singular-value offsets on MLP layers ensure cross-style stability. To better handle semantic confusions across styles, we design StyleNCE, an optimal-transport-weighted contrastive loss that emphasizes hard cross-style negatives, as part of Hystar. Extensive experiments on multi-style retrieval and cross-style classification benchmarks demonstrate that Hystar consistently outperforms strong baselines, achieving state-of-the-art performance while being parameter-efficient and stable across styles.
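The singular-value modulation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 2x2 rotation factors, the one-layer linear "hypernetwork" (`hyper_delta_s`), the projection weights `PROJ`, and the style descriptors are all hypothetical stand-ins chosen to keep the example self-contained. It only shows the general idea of keeping $U$ and $V^\top$ frozen and rebuilding a weight as $U\,\mathrm{diag}(S + \Delta S)\,V^\top$, with $\Delta S$ computed per input.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def rotation(theta):
    """2x2 orthogonal matrix, standing in for a factor of a precomputed SVD."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def hyper_delta_s(style, proj):
    """Hypothetical hypernetwork: one linear layer mapping a style vector to Delta-S."""
    return [sum(w * f for w, f in zip(row, style)) for row in proj]

def modulate(u, s, vt, delta_s):
    """Rebuild the weight as U diag(S + Delta-S) V^T; U and V^T stay frozen."""
    shifted = [a + b for a, b in zip(s, delta_s)]
    return matmul(matmul(u, diag(shifted)), vt)

# Frozen factors of a pretend attention projection, W = U diag(S) V^T.
U = rotation(0.3)
Vt = transpose(rotation(-0.7))
S = [2.0, 0.5]

# Assumed hypernetwork weights: 2 singular values x 3 style features.
PROJ = [[0.1, 0.0, -0.2], [0.0, 0.05, 0.0]]

sketch_style = [1.0, 0.0, 1.0]  # made-up style descriptor for a sketch query
photo_style = [0.0, 0.0, 0.0]   # zero descriptor leaves the weight unchanged

W_base = modulate(U, S, Vt, [0.0, 0.0])
W_sketch = modulate(U, S, Vt, hyper_delta_s(sketch_style, PROJ))
W_photo = modulate(U, S, Vt, hyper_delta_s(photo_style, PROJ))
```

In this toy setup only two numbers per layer change between queries, which is one way to read the abstract's claim of parameter efficiency. A static offset of the kind described for MLP layers would correspond to replacing the output of `hyper_delta_s` with a fixed learned vector that does not depend on the input.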
Cite
Text
Cai et al. "Hystar: Hypernetwork-Driven Style-Adaptive Retrieval via Dynamic SVD Modulation." International Conference on Learning Representations, 2026.
Markdown
[Cai et al. "Hystar: Hypernetwork-Driven Style-Adaptive Retrieval via Dynamic SVD Modulation." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/cai2026iclr-hystar/)
BibTeX
@inproceedings{cai2026iclr-hystar,
title = {{Hystar: Hypernetwork-Driven Style-Adaptive Retrieval via Dynamic SVD Modulation}},
author = {Cai, Yujia and Li, Boxuan and Xu, Chenghao and Yan, Jiexi},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/cai2026iclr-hystar/}
}