Entropy-Based Decoding for Retrieval-Augmented Large Language Models

Abstract

Augmenting Large Language Models (LLMs) with retrieved external knowledge has proven effective for improving the factual accuracy of generated responses. Despite their success, retrieval-augmented LLMs still face the distractibility issue, where the generated responses are negatively influenced by noise from both external and internal knowledge sources. In this paper, we introduce a novel, training-free decoding method guided by entropy considerations to mitigate this issue. Our approach utilizes entropy-based document-parallel ensemble decoding to prioritize low-entropy distributions from retrieved documents, thereby enhancing the extraction of relevant information from the context. Additionally, it incorporates a contrastive decoding mechanism that contrasts the obtained low-entropy ensemble distribution with the high-entropy distribution derived from the model's internal knowledge across layers, which ensures a greater emphasis on reliable external information. Extensive experiments on open-domain question answering datasets demonstrate the superiority of our method.
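
As a rough illustration of the mechanism sketched in the abstract (not the authors' exact formulation), the snippet below combines per-document next-token distributions with weights favoring low entropy and then contrasts the ensemble with an internal, context-free distribution. The softmax weighting over negative entropies, the temperature `tau`, and the contrast strength `alpha` are illustrative assumptions.

```python
# Minimal sketch of entropy-weighted ensembling plus contrastive decoding.
# Assumptions: weighting scheme, `tau`, and `alpha` are illustrative choices,
# not the paper's exact method.
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector."""
    return -np.sum(p * np.log(p + eps))

def entropy_weighted_ensemble(doc_dists, tau=1.0):
    """Combine per-document next-token distributions, favoring low entropy.

    doc_dists: array of shape (num_docs, vocab_size); each row is the
    next-token distribution conditioned on one retrieved document.
    """
    entropies = np.array([entropy(p) for p in doc_dists])
    weights = np.exp(-entropies / tau)   # lower entropy -> larger weight
    weights /= weights.sum()
    return weights @ doc_dists

def contrastive_decode(ensemble_dist, internal_dist, alpha=0.5, eps=1e-12):
    """Contrast the external ensemble distribution with the internal one,
    boosting tokens the retrieved documents support more strongly than the
    model's parametric knowledge alone."""
    scores = (1 + alpha) * np.log(ensemble_dist + eps) - alpha * np.log(internal_dist + eps)
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

# Toy example: 5-token vocabulary, 3 retrieved documents.
rng = np.random.default_rng(0)
doc_dists = rng.dirichlet(np.ones(5), size=3)   # per-document distributions
internal_dist = rng.dirichlet(np.ones(5))       # context-free distribution
p_ext = entropy_weighted_ensemble(doc_dists, tau=0.5)
p_final = contrastive_decode(p_ext, internal_dist, alpha=0.5)
next_token = int(np.argmax(p_final))
```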

Cite

Text

Qiu et al. "Entropy-Based Decoding for Retrieval-Augmented Large Language Models." NeurIPS 2024 Workshops: MINT, 2024.

Markdown

[Qiu et al. "Entropy-Based Decoding for Retrieval-Augmented Large Language Models." NeurIPS 2024 Workshops: MINT, 2024.](https://mlanthology.org/neuripsw/2024/qiu2024neuripsw-entropybased/)

BibTeX

@inproceedings{qiu2024neuripsw-entropybased,
  title     = {{Entropy-Based Decoding for Retrieval-Augmented Large Language Models}},
  author    = {Qiu, Zexuan and Ou, Zijing and Wu, Bin and Li, Jingjing and Liu, Aiwei and King, Irwin},
  booktitle = {NeurIPS 2024 Workshops: MINT},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/qiu2024neuripsw-entropybased/}
}