Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning

Abstract

Large Language Models (LLMs) are increasingly used to simulate human users in interactive settings such as therapy, education, and social role-play. While these simulations enable scalable training and evaluation of AI agents, off-the-shelf LLMs often drift from their assigned personas, contradict earlier statements, or abandon role-appropriate behavior. We introduce a unified framework for evaluating and improving persona consistency in LLM-generated dialogue. We define three automatic metrics (prompt-to-line consistency, line-to-line consistency, and Q&A consistency) that capture different types of persona drift, and we validate each against human annotations. Using these metrics as reward signals, we apply multi-turn reinforcement learning to fine-tune LLMs for three user roles: a patient, a student, and a social chat partner. Our method reduces inconsistency by over 55%, resulting in more coherent, faithful, and trustworthy simulated users.
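
To make the reward construction concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how per-turn consistency scores might be combined into a return for multi-turn RL. The three scorer functions are placeholders standing in for the paper's prompt-to-line, line-to-line, and Q&A consistency metrics; in practice they would be backed by an LLM judge or a trained classifier.

# Hypothetical sketch: combining per-turn consistency scores into a
# discounted return for multi-turn RL. The three scorers below are
# placeholders for the paper's metrics.
from typing import List

def prompt_to_line_consistency(persona: str, line: str) -> float:
    """Placeholder: score in [0, 1] for agreement between the persona
    prompt and a single generated line."""
    return 1.0  # stub

def line_to_line_consistency(history: List[str], line: str) -> float:
    """Placeholder: score in [0, 1] for agreement between the new line
    and earlier lines in the dialogue."""
    return 1.0  # stub

def qa_consistency(persona: str, line: str) -> float:
    """Placeholder: score in [0, 1] from probing questions about the
    persona, answered against the generated line."""
    return 1.0  # stub

def turn_reward(persona: str, history: List[str], line: str,
                weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted combination of the three consistency scores for one turn."""
    scores = (prompt_to_line_consistency(persona, line),
              line_to_line_consistency(history, line),
              qa_consistency(persona, line))
    return sum(w * s for w, s in zip(weights, scores))

def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Standard discounted return G_t = r_t + gamma * G_{t+1}, computed
    backward over the turns of one dialogue."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# Example: score each simulated-user turn in a short dialogue, then
# compute the returns a policy-gradient update would weight by.
persona = "a patient with chronic migraines who dislikes taking medication"
dialogue = ["I've had headaches most days this month.",
            "I'd rather try lifestyle changes before any prescription."]
rewards = [turn_reward(persona, dialogue[:t], line)
           for t, line in enumerate(dialogue)]
print(discounted_returns(rewards))

The equal default weights and the discount factor are illustrative choices, not values taken from the paper.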

Cite

Text

Abdulhai et al. "Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.

Markdown

[Abdulhai et al. "Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/abdulhai2025neurips-consistently/)

BibTeX

@inproceedings{abdulhai2025neurips-consistently,
  title     = {{Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning}},
  author    = {Abdulhai, Marwa and Cheng, Ryan and Clay, Donovan and Althoff, Tim and Levine, Sergey and Jaques, Natasha},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/abdulhai2025neurips-consistently/}
}