IPAdapter-Instruct: Resolving Ambiguity in Image-Based Conditioning Using Instruct Prompts

Abstract

Diffusion models continuously push the boundary of state-of-the-art image generation, but the process is hard to control with any nuance: practice proves that textual prompts are inadequate for accurately describing image style or fine structural details (such as faces). ControlNet [43] and IPAdapter [39] address this shortcoming by conditioning the generative process on imagery instead, but each individual instance is limited to modeling a single conditional posterior: for practical use cases, where multiple different posteriors are desired within the same workflow, training and using multiple adapters is cumbersome. We propose IPAdapter-Instruct, which combines natural-image conditioning with “Instruct” prompts to swap between interpretations for the same conditioning image: style transfer, object extraction, both, or something else still? IPAdapter-Instruct efficiently learns multiple tasks with minimal loss in quality compared to dedicated per-task models.

Cite

Text

Rowles et al. "IPAdapter-Instruct: Resolving Ambiguity in Image-Based Conditioning Using Instruct Prompts." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91838-4_4

Markdown

[Rowles et al. "IPAdapter-Instruct: Resolving Ambiguity in Image-Based Conditioning Using Instruct Prompts." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/rowles2024eccvw-ipadapterinstruct/) doi:10.1007/978-3-031-91838-4_4

BibTeX

@inproceedings{rowles2024eccvw-ipadapterinstruct,
  title     = {{IPAdapter-Instruct: Resolving Ambiguity in Image-Based Conditioning Using Instruct Prompts}},
  author    = {Rowles, Ciara and Vainer, Shimon and De Nigris, Dante and Elizarov, Slava and Kutsy, Konstantin and Donné, Simon},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2024},
  pages     = {54--70},
  doi       = {10.1007/978-3-031-91838-4_4},
  url       = {https://mlanthology.org/eccvw/2024/rowles2024eccvw-ipadapterinstruct/}
}