Mitigating Lies in Vision-Language Models
Abstract
In this work, we bring new insights into the honesty of vision-language models, particularly in visual question answering (VQA). After a thorough revisit of the existing ‘lie’ behavior in pure language models, we extend ‘lies’ to vision-language models for the first time. The results indicate that lie prefixes mislead vision-language models more strongly than they mislead language models. We also propose a novel visual prefix and show that a consistent vision-language prefix is more threatening to vision-language models. To defend the models against these ‘lies’, we put forward an unsupervised framework based on Gaussian mixture modeling and obtain improvements of 3% against the language prefix and 12% against the vision-language prefix.
Cite
Text
Li et al. "Mitigating Lies in Vision-Language Models." NeurIPS 2022 Workshops: MLSW, 2022.
Markdown
[Li et al. "Mitigating Lies in Vision-Language Models." NeurIPS 2022 Workshops: MLSW, 2022.](https://mlanthology.org/neuripsw/2022/li2022neuripsw-mitigating/)
BibTeX
@inproceedings{li2022neuripsw-mitigating,
title = {{Mitigating Lies in Vision-Language Models}},
author = {Li, Junbo and Li, Xianhang and Xie, Cihang},
booktitle = {NeurIPS 2022 Workshops: MLSW},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/li2022neuripsw-mitigating/}
}