SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Abstract

We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of relying solely on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: if a citation is necessary, removing the cited text from the context should prevent the same response; if a citation is sufficient, retaining the cited text alone should preserve the same response. This reward can guide an inference-time best-of-N sampling strategy to improve citation quality significantly, and can also be used in preference optimization to directly fine-tune the model to generate better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 by up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks. The source code is available at https://github.com/facebookresearch/SelfCite.
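The necessity and sufficiency checks described above can be sketched as two log-probability differences: a "prob-drop" term (how much the response's likelihood falls when the cited sentences are ablated) and a "prob-hold" term (how much of the likelihood survives when only the cited sentences are kept). The sketch below is a minimal, hypothetical illustration of that idea, not the paper's implementation: `log_prob` is a toy word-overlap stand-in for an LLM's log p(response | context), and `ablation_reward` is an assumed helper name.

```python
def log_prob(response: str, context: list[str]) -> float:
    """Toy stand-in for an LLM's log p(response | context).

    Here we simply count response words that appear in the kept
    context sentences; a real system would score the response with
    the citing LLM itself.
    """
    ctx_words = {w for sent in context for w in sent.split()}
    return float(sum(1 for w in response.split() if w in ctx_words))


def ablation_reward(response: str, context: list[str], cited: set[int]) -> float:
    """Context-ablation reward for a candidate citation (sentence indices)."""
    cited_sents = [s for i, s in enumerate(context) if i in cited]
    rest_sents = [s for i, s in enumerate(context) if i not in cited]

    full = log_prob(response, context)
    prob_drop = full - log_prob(response, rest_sents)   # necessity: ablating cited text should hurt
    prob_hold = log_prob(response, cited_sents) - full  # sufficiency: cited text alone should suffice
    return prob_drop + prob_hold


context = ["the sky is blue", "cats are mammals"]
response = "the sky is blue"

good = ablation_reward(response, context, cited={0})  # cites the supporting sentence
bad = ablation_reward(response, context, cited={1})   # cites an irrelevant sentence
print(good, bad)  # the correct citation scores higher
```

Under this toy scorer, citing the supporting sentence yields a strictly higher reward than citing the irrelevant one, which is exactly the signal used to rank best-of-N citation samples or to build preference pairs for fine-tuning.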

Cite

Text

Chuang et al. "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Chuang et al. "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/chuang2025icml-selfcite/)

BibTeX

@inproceedings{chuang2025icml-selfcite,
  title     = {{SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models}},
  author    = {Chuang, Yung-Sung and Cohen-Wang, Benjamin and Shen, Zejiang and Wu, Zhaofeng and Xu, Hu and Lin, Xi Victoria and Glass, James R. and Li, Shang-Wen and Yih, Wen-Tau},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {10839--10858},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/chuang2025icml-selfcite/}
}