Measuring and Enhancing Trustworthiness of LLMs in RAG Through Grounded Attributions and Learning to Refuse
Abstract
LLMs are an integral component of retrieval-augmented generation (RAG) systems. While many studies focus on evaluating the overall quality of end-to-end RAG systems, there is a gap in understanding the appropriateness of LLMs for the RAG task. To address this, we introduce Trust-Score, a holistic metric that evaluates the trustworthiness of LLMs within the RAG framework. Our results show that various prompting methods, such as in-context learning, fail to effectively adapt LLMs to the RAG task as measured by Trust-Score. Consequently, we propose Trust-Align, a method to align LLMs for improved Trust-Score performance. 26 out of 27 models aligned using Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5. Specifically, on LLaMA-3-8b, Trust-Align outperforms FRONT on ASQA (↑12.56), QAMPARI (↑36.04), and ELI5 (↑17.69). Trust-Align also significantly enhances models’ ability to correctly refuse and provide quality citations. We also demonstrate the effectiveness of Trust-Align across different open-weight models, including the LLaMA series (1b to 8b), Qwen-2.5 series (0.5b to 7b), and Phi3.5 (3.8b). We release our code at https://github.com/declare-lab/trust-align.
Cite
Text
Song et al. "Measuring and Enhancing Trustworthiness of LLMs in RAG Through Grounded Attributions and Learning to Refuse." International Conference on Learning Representations, 2025.
Markdown
[Song et al. "Measuring and Enhancing Trustworthiness of LLMs in RAG Through Grounded Attributions and Learning to Refuse." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/song2025iclr-measuring/)
BibTeX
@inproceedings{song2025iclr-measuring,
title = {{Measuring and Enhancing Trustworthiness of LLMs in RAG Through Grounded Attributions and Learning to Refuse}},
author = {Song, Maojia and Sim, Shang Hong and Bhardwaj, Rishabh and Chieu, Hai Leong and Majumder, Navonil and Poria, Soujanya},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/song2025iclr-measuring/}
}