IC-Custom: Diverse Image Customization via In-Context Learning

Li, Yaowei; Li, Xiaoyu; Zhang, Zhaoyang; Bian, Yuxuan; Liu, Gan; Li, Xinyuan; Xu, Jiale; Hu, Wenbo; Liu, Yating; Li, Lingen; Cai, Jing; Zou, Yuexian; He, Yancheng; Shan, Ying

ICLR 2026

Abstract

Image customization, a crucial technique for industrial media production, aims to generate content that is consistent with reference images. However, current approaches conventionally split image customization into position-aware and position-free paradigms and lack a universal framework for diverse customization, which limits their applicability across scenarios. To overcome these limitations, we propose IC-Custom, a unified framework that integrates position-aware and position-free image customization through in-context learning. IC-Custom concatenates reference images with target images into a polyptych, leveraging DiT's multi-modal attention mechanism for fine-grained token-level interactions. We propose the In-context Multi-Modal Attention (ICMA) mechanism, which employs learnable task-oriented register tokens and boundary-aware positional embeddings so that the model can handle diverse tasks and distinguish between inputs in polyptych configurations. To address the data gap, we curated a 12K identity-consistent dataset with 8K real-world and 4K high-quality synthetic samples, avoiding the overly glossy, oversaturated look typical of synthetic data. IC-Custom supports various industrial applications, including try-on, image insertion, and creative IP customization. Extensive evaluations on our proposed ProductBench and the publicly available DreamBench demonstrate that IC-Custom significantly outperforms community workflows, closed-source models, and state-of-the-art open-source approaches. IC-Custom achieves about 73% higher human preference across identity consistency, harmony, and text alignment metrics, while training only 0.4% of the original model parameters.
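
As a rough illustration of the polyptych idea described in the abstract, the toy sketch below concatenates a reference image and a target image side by side and records which panel each column belongs to. It is not the authors' implementation: IC-Custom operates on tokens inside a DiT, whereas this sketch uses nested Python lists as stand-in images, and the names `make_polyptych` and `panel_ids` are hypothetical helpers introduced only for this example. The per-column panel index is a simplified stand-in for the kind of information a boundary-aware positional embedding would need to tell the inputs apart.

```python
from typing import List

# Toy grayscale image: a list of rows, each row a list of pixel values.
Image = List[List[int]]


def make_polyptych(reference: Image, target: Image) -> Image:
    """Concatenate reference and target side by side (hypothetical helper)."""
    if len(reference) != len(target):
        raise ValueError("reference and target must have the same height")
    return [ref_row + tgt_row for ref_row, tgt_row in zip(reference, target)]


def panel_ids(reference: Image, target: Image) -> List[int]:
    """Per-column panel index: 0 for reference columns, 1 for target columns."""
    return [0] * len(reference[0]) + [1] * len(target[0])


reference = [[1, 2], [3, 4]]        # 2x2 reference panel
target = [[0, 0, 0], [0, 0, 0]]     # 2x3 target panel, to be generated
polyptych = make_polyptych(reference, target)
ids = panel_ids(reference, target)

print(polyptych)  # [[1, 2, 0, 0, 0], [3, 4, 0, 0, 0]]
print(ids)        # [0, 0, 1, 1, 1]
```

With more than one reference image, the same concatenation would extend to additional panels, each with its own panel index.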

Cite

Text

Li et al. "IC-Custom: Diverse Image Customization via In-Context Learning." International Conference on Learning Representations, 2026.

Markdown

[Li et al. "IC-Custom: Diverse Image Customization via In-Context Learning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/li2026iclr-iccustom/)

BibTeX

@inproceedings{li2026iclr-iccustom,
  title     = {{IC-Custom: Diverse Image Customization via In-Context Learning}},
  author    = {Li, Yaowei and Li, Xiaoyu and Zhang, Zhaoyang and Bian, Yuxuan and Liu, Gan and Li, Xinyuan and Xu, Jiale and Hu, Wenbo and Liu, Yating and Li, Lingen and Cai, Jing and Zou, Yuexian and He, Yancheng and Shan, Ying},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/li2026iclr-iccustom/}
}