Quantifying the Plausibility of Context Reliance in Neural Machine Translation

Abstract

Establishing whether language models can use contextual information in a human-plausible way is important to ensure their safe adoption in real-world settings. However, the questions of $\textit{when}$ and $\textit{which parts}$ of the context affect model generations are typically tackled separately, and current plausibility evaluations are practically limited to a handful of artificial benchmarks. To address this, we introduce $\textbf{P}$lausibility $\textbf{E}$valuation of $\textbf{Co}$ntext $\textbf{Re}$liance (PECoRe), an end-to-end interpretability framework designed to quantify context usage in language models' generations. Our approach leverages model internals to (i) contrastively identify context-sensitive target tokens in generated texts and (ii) link them to contextual cues justifying their prediction. We use PECoRe to quantify the plausibility of context-aware machine translation models, comparing model rationales with human annotations across several discourse-level phenomena. Finally, we apply our method to unannotated model translations to identify context-mediated predictions and highlight instances of (im)plausible context usage throughout generation.
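Step (i) above can be illustrated with a minimal sketch. It assumes the contrast is measured as the KL divergence between the model's next-token distribution with and without context; the abstract does not name a metric, so this choice, the threshold, and the function names `kl_divergence` and `context_sensitive_positions` are all hypothetical. The probabilities are toy values, not model outputs, and step (ii), attributing flagged tokens to contextual cues, is not covered.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Assumes q[i] > 0 wherever p[i] > 0 (true for softmax outputs).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def context_sensitive_positions(with_ctx, without_ctx, threshold):
    """Return (position, score) pairs for generation steps whose
    next-token distribution shifts by more than `threshold` when
    the context is provided.

    `with_ctx` and `without_ctx` hold one probability distribution
    per generated token, aligned by position.
    """
    scores = [kl_divergence(p, q) for p, q in zip(with_ctx, without_ctx)]
    return [(i, s) for i, s in enumerate(scores) if s > threshold]


# Toy example: three generation steps over a three-word vocabulary.
# Only the second step changes when context is added.
with_ctx = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
without_ctx = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]

flagged = context_sensitive_positions(with_ctx, without_ctx, threshold=0.1)
print(flagged)  # only position 1 is flagged
```

In a real setting the two lists would come from running the same translation model twice on the same target prefix, once with and once without the preceding context.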

Cite

Text

Sarti et al. "Quantifying the Plausibility of Context Reliance in Neural Machine Translation." International Conference on Learning Representations, 2024.

Markdown

[Sarti et al. "Quantifying the Plausibility of Context Reliance in Neural Machine Translation." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/sarti2024iclr-quantifying/)

BibTeX

@inproceedings{sarti2024iclr-quantifying,
  title     = {{Quantifying the Plausibility of Context Reliance in Neural Machine Translation}},
  author    = {Sarti, Gabriele and Chrupała, Grzegorz and Nissim, Malvina and Bisazza, Arianna},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/sarti2024iclr-quantifying/}
}