TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLMs' outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground-truth information. This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge. Additionally, TrustScore can seamlessly integrate with fact-checking methods, which assess alignment with external knowledge sources. The experimental results show that TrustScore achieves strong correlations with human judgments, surpassing existing reference-free metrics and performing on par with reference-based metrics.
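To make the Behavioral Consistency idea concrete, below is a minimal sketch of one plausible way to operationalize it: re-ask the model the same question several times and measure how often its answers agree with the response being evaluated. This is an illustrative proxy, not the paper's actual algorithm; the `ask_model` callable and the exact-match-after-normalization agreement heuristic are assumptions introduced here for demonstration.

```python
import re

def behavioral_consistency_score(ask_model, question, response, k=5):
    """Score how consistently a model stands by a given answer.

    `ask_model(prompt) -> str` is a placeholder for any LLM call.
    Re-asking the question `k` times and measuring agreement with
    `response` is a simple proxy for behavioral consistency; the
    paper's actual method may differ.
    """
    def normalize(text):
        # Crude normalization so trivially different phrasings match.
        return re.sub(r"\W+", " ", text).strip().lower()

    target = normalize(response)
    samples = [normalize(ask_model(question)) for _ in range(k)]
    # Fraction of re-sampled answers that agree with the original response.
    return sum(s == target for s in samples) / k

# Usage with a stub model that always answers "Paris":
score = behavioral_consistency_score(
    lambda q: "Paris.", "What is the capital of France?", "Paris"
)
print(score)  # 1.0: the stub model's answers always agree
```

A response the model reproduces consistently under repeated querying would score near 1, while an answer the model rarely regenerates (suggesting it conflicts with the model's intrinsic knowledge) would score near 0.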

Cite

Text

Zheng et al. "TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness." ICLR 2024 Workshops: SeT_LLM, 2024.

Markdown

[Zheng et al. "TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness." ICLR 2024 Workshops: SeT_LLM, 2024.](https://mlanthology.org/iclrw/2024/zheng2024iclrw-trustscore/)

BibTeX

@inproceedings{zheng2024iclrw-trustscore,
  title     = {{TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness}},
  author    = {Zheng, Danna and Liu, Danyang and Lapata, Mirella and Pan, Jeff Z.},
  booktitle = {ICLR 2024 Workshops: SeT_LLM},
  year      = {2024},
  url       = {https://mlanthology.org/iclrw/2024/zheng2024iclrw-trustscore/}
}