Precise and Interpretable Editing of Code Knowledge in Large Language Models

Abstract

Large Language Models (LLMs) have demonstrated outstanding capabilities in various code-related tasks, including code completion, translation, and summarization. However, these pretrained models are static, which makes it challenging to incorporate new knowledge into an LLM to correct erroneous behavior. Approaches such as retraining or fine-tuning demand extensive labeled datasets and can be computationally expensive, while prompt engineering fails to change models permanently. Knowledge Editing (KE) techniques offer a more efficient alternative, enabling model updates with minimal data, even just a single example. Nevertheless, existing KE methods often manipulate parameters within the Transformer's multi-layer perceptrons (MLPs), where neuronal polysemanticity hinders both the precision and interpretability of the edits. To address these limitations, we exploit TransCoder, an MLP-like model component with a wide and sparsely activated hidden feature vector. Specifically, we introduce **TransCoder-based Precise Editing** (**TCPE**), a novel method that leverages the sparsity and monosemanticity of the TransCoder's neurons for highly localized knowledge editing. TCPE exhibits neuron-level mechanistic interpretability characteristics, revealing the correspondence between the edited neurons and the specific code-related knowledge. Furthermore, we present KECode, a new evaluation benchmark for code-to-code translation based on functional equivalence. Using KECode, we conduct a systematic evaluation of representative KE methods in the context of code-to-code translation. Our experimental results demonstrate that TCPE outperforms existing KE methods, substantially improving the translation accuracy of CodeLlama-7b-Instruct from 57.5% to 64.0% in a low-resource scenario of Java-to-D translation.
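To make the idea of a localized edit in a wide, sparsely activated layer more concrete, here is a minimal NumPy sketch. It is not the TCPE algorithm: the dimensions, random weights, bias value, the rule for choosing the target neuron, and the rank-one update are all assumptions chosen for illustration. It only shows why changing the decoder row of a single active feature shifts the output for inputs that activate that feature, while leaving inputs that do not activate it untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 64          # hidden layer is much wider than the model dimension

# Toy "transcoder-like" layer (all weights are random placeholders, not trained).
W_enc = rng.normal(size=(d_model, d_hidden))
b_enc = -3.0 * np.ones(d_hidden)   # negative bias keeps most features at zero (illustrative)
W_dec = rng.normal(size=(d_hidden, d_model))


def forward(x, decoder):
    """Wide ReLU feature vector followed by a linear decode."""
    features = np.maximum(x @ W_enc + b_enc, 0.0)
    return features, features @ decoder


# Build an input aligned with feature 0's encoder direction so that at least
# this feature fires: its pre-activation is exactly 6 - 3 = 3.
col0 = W_enc[:, 0]
x = 6.0 * col0 / (col0 @ col0)

features, y_before = forward(x, W_dec)
active = np.flatnonzero(features)          # indices of the active features

# Hypothetical edit: pick the most active feature and shift its decoder row
# so that the layer output for x moves by a desired vector `delta`.
target = int(np.argmax(features))
delta = rng.normal(size=d_model)           # stand-in for the desired output change
W_dec_edited = W_dec.copy()
W_dec_edited[target] += delta / features[target]

_, y_after = forward(x, W_dec_edited)

# An input on which the target feature is inactive (-x pushes its
# pre-activation below zero) is unaffected by the edit.
feats_other, other_before = forward(-x, W_dec)
_, other_after = forward(-x, W_dec_edited)

print("active features:", active.size, "of", d_hidden)
print("output shift equals delta:", np.allclose(y_after - y_before, delta))
print("unrelated input unchanged:", np.allclose(other_before, other_after))
```

Only one row of the decoder matrix changes, which is the sense in which such an edit is local; how well this carries over to a real model depends on how sparse and monosemantic the learned features actually are.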

Cite

Text

Xue et al. "Precise and Interpretable Editing of Code Knowledge in Large Language Models." International Conference on Learning Representations, 2026.

Markdown

[Xue et al. "Precise and Interpretable Editing of Code Knowledge in Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xue2026iclr-precise/)

BibTeX

@inproceedings{xue2026iclr-precise,
  title     = {{Precise and Interpretable Editing of Code Knowledge in Large Language Models}},
  author    = {Xue, Min and Bolik, Nikolai and Stöpler, Lennart and Imgrund, Erik and Schmid, Janik and Andrzejak, Artur},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/xue2026iclr-precise/}
}