When Models Know More than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Shi, Quan; Jimenez, Carlos E; Yao, Shunyu; Haber, Nick; Yang, Diyi; Narasimhan, Karthik R

NeurIPS 2025

Abstract

As large language models (LLMs) increasingly serve as close collaborators for humans, it is crucial that they express their reasoning in ways that humans can understand and learn from. However, this capability remains poorly understood and under-evaluated. To address this, we introduce a conceptual framework for Human-AI knowledge transfer and conduct the first large-scale user study (N=118) explicitly designed to measure it. In our two-phase setup, humans first ideate with an LLM on problem-solving strategies, then independently implement solutions, isolating the influence of model reasoning on human understanding. Our findings reveal that while model benchmark performance correlates with collaborative outcomes, the relationship is notably inconsistent, with significant outliers, highlighting that knowledge transfer is a distinct capability requiring dedicated optimization. Our analysis uncovers behavioral and strategic factors that mediate successful knowledge transfer, and we release our code, dataset, and evaluation framework to support future work on communicatively aligned models.

Cite

Text

Shi et al. "When Models Know More than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration." Advances in Neural Information Processing Systems, 2025.

Markdown

[Shi et al. "When Models Know More than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/shi2025neurips-models/)

BibTeX

@inproceedings{shi2025neurips-models,
  title     = {{When Models Know More than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration}},
  author    = {Shi, Quan and Jimenez, Carlos E and Yao, Shunyu and Haber, Nick and Yang, Diyi and Narasimhan, Karthik R},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/shi2025neurips-models/}
}