NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context

Abstract

While LLMs have demonstrated medical knowledge and conversational ability, their deployment in clinical practice raises new risks: patients may place greater trust in LLM-generated responses than in nurses' professional judgments, potentially intensifying nurse–patient conflicts. Such risks highlight the urgent need to evaluate whether LLMs align with the core nursing values upheld by human nurses. This work introduces the first benchmark for nursing value alignment, consisting of five core value dimensions distilled from international nursing codes: _Altruism_, _Human Dignity_, _Integrity_, _Justice_, and _Professionalism_. We define tasks at two levels on the benchmark, reflecting two characteristics of emerging nurse–patient conflicts. The **Easy-Level** dataset consists of 2,200 value-aligned and value-violating instances collected through a five-month longitudinal field study across three hospitals of varying tiers. The **Hard-Level** dataset comprises 2,200 dialogue-based instances that embed contextual cues and subtle misleading signals, which increase adversarial complexity and better reflect the subjectivity and bias of narrators in emerging nurse–patient conflicts. We evaluate 23 SoTA LLMs on their ability to align with nursing values and find that general LLMs outperform medical ones and that _Justice_ is the hardest value dimension. As the first real-world benchmark for healthcare value alignment, NurValues provides novel insights into how LLMs navigate ethical challenges in clinician–patient interactions.
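As a rough illustration of how results over value-aligned and value-violating instances could be broken down by the five dimensions, the sketch below tallies per-dimension accuracy. It is not the authors' evaluation code: the abstract does not specify the metric or the data format, so the record fields (`dimension`, `text`, `aligned`), the `predict` callable, and the toy records are all hypothetical placeholders.

```python
from collections import defaultdict

# The five value dimensions named in the abstract.
DIMENSIONS = ("Altruism", "Human Dignity", "Integrity", "Justice", "Professionalism")


def accuracy_by_dimension(instances, predict):
    """Return {dimension: accuracy} for a binary aligned/violating judgment.

    `instances` is an iterable of dicts with hypothetical keys:
      - "dimension": one of DIMENSIONS
      - "text": the case description shown to the model
      - "aligned": True if the behavior is value-aligned, False if value-violating
    `predict` is any callable mapping the text to a bool (a stand-in for an LLM call).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in instances:
        dim = item["dimension"]
        total[dim] += 1
        if predict(item["text"]) == item["aligned"]:
            correct[dim] += 1
    # Report only dimensions that actually appear in the data.
    return {dim: correct[dim] / total[dim] for dim in DIMENSIONS if total[dim]}


# Made-up placeholder records, not drawn from the NurValues dataset.
toy_instances = [
    {"dimension": "Justice", "text": "placeholder case A", "aligned": True},
    {"dimension": "Justice", "text": "placeholder case B", "aligned": False},
    {"dimension": "Integrity", "text": "placeholder case C", "aligned": True},
]


def always_aligned(text):
    """Trivial baseline that labels every case as value-aligned."""
    return True


scores = accuracy_by_dimension(toy_instances, always_aligned)
print(scores)
```

A per-dimension breakdown of this kind is what would make a statement such as "one dimension is the hardest" checkable; the baseline here exists only to exercise the function.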

Cite

Text

Yao et al. "NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context." International Conference on Learning Representations, 2026.

Markdown

[Yao et al. "NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/yao2026iclr-nurvalues/)

BibTeX

@inproceedings{yao2026iclr-nurvalues,
  title     = {{NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context}},
  author    = {Yao, Ben and Li, Qiuchi and Zhang, Yazhou and Siyu, Yang and Zhang, Bohan and Tiwari, Prayag and Qin, Jing},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/yao2026iclr-nurvalues/}
}