Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching

Cite

Text

Yue et al. "Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown

[Yue et al. "Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/yue2025cvprw-instructionaugmented/)

BibTeX

@inproceedings{yue2025cvprw-instructionaugmented,
  title     = {{Instruction-Augmented Multimodal Alignment for Image-Text and Element Matching}},
  author    = {Yue, Xinli and Sun, Jianhui and Lu, Junda and Yao, Liangchao and Xia, Fan and Wang, Tianyi and Rao, Fengyun and Lyu, Jing and Deng, Yuetang},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2025},
  pages     = {1379--1388},
  url       = {https://mlanthology.org/cvprw/2025/yue2025cvprw-instructionaugmented/}
}