Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics

Abstract

Multi-modal large language models (MLLMs) are trained on top of large language models (LLMs), with an enhanced capability to comprehend multi-modal inputs and generate textual responses. While they excel in multi-modal tasks, the pure NLP abilities of MLLMs are often underestimated and left untested. In this study, we step outside the box and unveil an intriguing characteristic of MLLMs: our preliminary results suggest that visual instruction tuning, a prevailing strategy for transitioning LLMs into MLLMs, unexpectedly and interestingly helps models attain both improved truthfulness and ethical alignment in the pure NLP context. For example, a visual-instruction-tuned LLaMA2 7B model surpasses the LLaMA2-chat 7B model, which was fine-tuned with over one million human annotations, on the TruthfulQA and Ethics benchmarks. Further analysis reveals that the improved alignment can be attributed to the superior instruction quality inherent to visual-text data. In releasing our code at https://github.com/UCSC-VLAA/Sight-Beyond-Text, we aspire to foster further exploration into the intrinsic value of visual-text synergies and, in a broader scope, multi-modal interactions in alignment research.
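As a rough illustration of the truthfulness comparison described in the abstract, the sketch below scores a causal LM on TruthfulQA's multiple-choice (MC1) split by comparing the log-likelihood of each candidate answer. This is not the paper's exact evaluation pipeline: the model name, prompt template, and scoring details here are illustrative assumptions; only the Hugging Face dataset (truthful_qa, multiple_choice config) and standard transformers/datasets APIs are taken as given.

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any Hugging Face causal LM can be dropped in here.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

dataset = load_dataset("truthful_qa", "multiple_choice")["validation"]

def answer_log_likelihood(question: str, answer: str) -> float:
    """Sum of token log-probs of `answer` given a simple Q/A prompt.

    Note: re-tokenizing prompt + answer together can merge tokens at the
    boundary; fine for a sketch, but exact harnesses handle this carefully.
    """
    prompt = f"Q: {question}\nA: "
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i predicts token i+1, so shift by one to score answer tokens only.
    answer_start = prompt_ids.shape[1]
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    answer_tokens = full_ids[0, answer_start:]
    token_lps = log_probs[answer_start - 1 : full_ids.shape[1] - 1].gather(
        1, answer_tokens.unsqueeze(1)
    )
    return token_lps.sum().item()

correct = 0
subset = dataset.select(range(50))  # small subset for a quick check
for ex in subset:
    choices = ex["mc1_targets"]["choices"]
    scores = [answer_log_likelihood(ex["question"], c) for c in choices]
    pred = max(range(len(choices)), key=scores.__getitem__)
    correct += ex["mc1_targets"]["labels"][pred]  # label 1 marks the truthful answer

print(f"MC1 accuracy on subset: {correct / len(subset):.3f}")

Running the same loop with a visual-instruction-tuned checkpoint versus its base LLM is one quick way to probe the truthfulness gap the paper reports.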

Cite

Text

Tu et al. "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics." NeurIPS 2023 Workshops: Instruction, 2023.

Markdown

[Tu et al. "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics." NeurIPS 2023 Workshops: Instruction, 2023.](https://mlanthology.org/neuripsw/2023/tu2023neuripsw-sight/)

BibTeX

@inproceedings{tu2023neuripsw-sight,
  title     = {{Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics}},
  author    = {Tu, Haoqin and Zhao, Bingchen and Wei, Chen and Xie, Cihang},
  booktitle = {NeurIPS 2023 Workshops: Instruction},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/tu2023neuripsw-sight/}
}