ImageGem: In-the-Wild Generative Image Interaction Dataset for Generative Model Personalization
Abstract
We introduce ImageGem, a dataset for studying generative models that understand fine-grained individual preferences. We posit that a key challenge hindering the development of such generative models is the lack of in-the-wild, fine-grained user preference annotations. Our dataset features real-world interaction data from 57K users, who collectively have built 242K customized LoRAs, written 3M text prompts, and generated 5M images. With user preference annotations from our dataset, we were able to train better preference alignment models. In addition, leveraging individual user preferences, we investigated the performance of retrieval models and a vision-language model on personalized image retrieval and generative model recommendation. Finally, we propose an end-to-end framework for editing customized diffusion models in a latent weight space to align with individual user preferences. Our results demonstrate that the ImageGem dataset enables, for the first time, a new paradigm for generative model personalization.
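The latent weight-space editing framework mentioned in the abstract can be pictured with a minimal sketch: compress a customized model's LoRA weights into a latent code, nudge that code toward a user-preference direction, and decode back to weights. Everything below is a hypothetical illustration in PyTorch, not the paper's implementation; the autoencoder, dimensions, step size, and preference direction are all invented stand-ins.

import torch
import torch.nn as nn

# Toy sizes; a real system would flatten and normalize actual LoRA weights.
WEIGHT_DIM = 4096   # hypothetical flattened LoRA weight vector size
LATENT_DIM = 64     # hypothetical latent dimensionality

class WeightAutoencoder(nn.Module):
    """Maps flattened LoRA weights to a small latent space and back."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(WEIGHT_DIM, 512), nn.ReLU(), nn.Linear(512, LATENT_DIM))
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM, 512), nn.ReLU(), nn.Linear(512, WEIGHT_DIM))

ae = WeightAutoencoder()
lora_weights = torch.randn(1, WEIGHT_DIM)   # stand-in for a user's LoRA
pref_direction = torch.randn(LATENT_DIM)    # stand-in preference vector
pref_direction = pref_direction / pref_direction.norm()

with torch.no_grad():
    z = ae.encoder(lora_weights)            # project weights into latent space
    z_edited = z + 0.5 * pref_direction     # shift toward the user's preference
    edited_weights = ae.decoder(z_edited)   # decode back to edited LoRA weights

In a real pipeline, the preference direction would be derived from a user's interaction history (likes, prompts, downloads) rather than sampled at random.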
Cite
Text
Guo et al. "ImageGem: In-the-Wild Generative Image Interaction Dataset for Generative Model Personalization." International Conference on Computer Vision, 2025.

Markdown
[Guo et al. "ImageGem: In-the-Wild Generative Image Interaction Dataset for Generative Model Personalization." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/guo2025iccv-imagegem/)

BibTeX
@inproceedings{guo2025iccv-imagegem,
  title     = {{ImageGem: In-the-Wild Generative Image Interaction Dataset for Generative Model Personalization}},
  author    = {Guo, Yuanhe and Xie, Linxi and Chen, Zhuoran and Yu, Kangrui and Po, Ryan and Yang, Guandao and Wetzstein, Gordon and Wen, Hongyi},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {19577--19586},
  url       = {https://mlanthology.org/iccv/2025/guo2025iccv-imagegem/}
}