Policy Dreamer: Diverse Public Policy Generation via Elicitation and Simulation of Human Preferences

Karanam, Arjun; Enríquez, José Ramón; Sehwag, Udari Madhushani; Elabd, Michael; Gandhi, Kanishk; Goodman, Noah; Koyejo, Sanmi

Arjun Karanam, José Ramón Enríquez, Udari Madhushani Sehwag, Michael Elabd, Kanishk Gandhi, Noah Goodman, Sanmi Koyejo

NeurIPSW 2024

Abstract

Developing public policies that effectively address complex societal issues while representing diverse perspectives remains a significant challenge in governance and policy-making. This paper presents Policy Dreamer, an evolutionary dynamics-based preference aggregation method designed to create public policy that aligns with heterogeneous populations while preserving solution diversity. It does so in three stages: a) Initial Public Policy Generation (where a public policy is defined as a set of goals, actions, and strategies aimed at addressing a specific societal issue), b) Preference Elicitation from a constituency of humans, and c) Policy Refinement using simulated human feedback. We apply this approach to the domain of public policy creation, which requires navigating complex socioeconomic trade-offs. To validate our method, we measure our system's ability to create popular yet diverse policy proposals in three domains: Healthcare, Gun Control, and Social Media regulation. Our approach iteratively aligns policies with a base constituency, while using evolutionary search to ensure that policy diversity is not compromised. Compared against an expert-crafted set of policies, up to 25% of the generated policies are novel. However, the method exhibits limitations in capturing the full diversity of the expert-crafted policies, particularly in controversial or emerging policy domains. Overall, our preliminary results suggest that Large Language Models (LLMs) are able to actively elicit preferences from a constituency of people, and iteratively generate statements (public policies) that align with this constituency while preventing a collapse in statement diversity.
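The refinement stage described in the abstract (evolutionary search that keeps policies popular without collapsing diversity) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: policies are plain strings, simulated constituents are keyword sets, approval is keyword overlap, diversity is word-level Jaccard distance, and `toy_mutate` stands in for whatever LLM-based rewriting the actual system uses. The names `approval`, `select_diverse`, `evolve`, and `toy_mutate` are hypothetical.

```python
import random
from typing import Callable, List, Set


def jaccard_distance(a: str, b: str) -> float:
    """Return 1 - |A & B| / |A | B| over lowercase word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return 1.0 - len(wa & wb) / len(union) if union else 0.0


def approval(policy: str, voters: List[Set[str]]) -> float:
    """Fraction of simulated voters sharing at least one keyword with the policy.

    Assumption: a keyword match is a crude stand-in for simulated human feedback.
    """
    words = set(policy.lower().split())
    return sum(1 for v in voters if v & words) / len(voters)


def select_diverse(candidates: List[str], voters: List[Set[str]],
                   k: int, diversity_weight: float = 0.5) -> List[str]:
    """Greedily pick up to k policies by approval plus a novelty bonus.

    Novelty is the minimum Jaccard distance to the policies already chosen,
    so near-duplicates of a selected policy are penalised.
    """
    pool = list(dict.fromkeys(candidates))  # drop exact duplicates, keep order
    chosen: List[str] = []
    while pool and len(chosen) < k:
        def score(p: str) -> float:
            novelty = min((jaccard_distance(p, c) for c in chosen), default=1.0)
            return approval(p, voters) + diversity_weight * novelty
        best = max(pool, key=score)
        chosen.append(best)
        pool.remove(best)
    return chosen


def evolve(seeds: List[str], voters: List[Set[str]],
           mutate: Callable[[str, random.Random], str],
           generations: int = 3, k: int = 3,
           rng: random.Random = None) -> List[str]:
    """Mutate every policy each generation, then keep a diverse top-k."""
    rng = rng or random.Random(0)
    population = list(seeds)
    for _ in range(generations):
        children = [mutate(p, rng) for p in population]
        population = select_diverse(population + children, voters, k)
    return population


# Hypothetical mutation operator: appends a clause chosen at random.
CLAUSES = ["with rural clinics", "funded by a sliding-scale fee",
           "with an independent audit"]


def toy_mutate(policy: str, rng: random.Random) -> str:
    return f"{policy} {rng.choice(CLAUSES)}"


if __name__ == "__main__":
    seeds = ["expand public insurance", "subsidise private insurance",
             "cap prescription prices"]
    voters = [{"clinics", "rural"}, {"audit"}, {"insurance"}, {"prices"}]
    for policy in evolve(seeds, voters, toy_mutate):
        print(policy)
```

The novelty term in `select_diverse` is what keeps the population from converging on one high-approval policy; setting `diversity_weight` to 0 reduces the loop to plain popularity ranking.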

Cite

Text

Karanam et al. "Policy Dreamer: Diverse Public Policy Generation via Elicitation and Simulation of Human Preferences." NeurIPS 2024 Workshops: SoLaR, 2024.

Markdown

[Karanam et al. "Policy Dreamer: Diverse Public Policy Generation via Elicitation and Simulation of Human Preferences." NeurIPS 2024 Workshops: SoLaR, 2024.](https://mlanthology.org/neuripsw/2024/karanam2024neuripsw-policy/)

BibTeX

@inproceedings{karanam2024neuripsw-policy,
  title     = {{Policy Dreamer: Diverse Public Policy Generation via Elicitation and Simulation of Human Preferences}},
  author    = {Karanam, Arjun and Enríquez, José Ramón and Sehwag, Udari Madhushani and Elabd, Michael and Gandhi, Kanishk and Goodman, Noah and Koyejo, Sanmi},
  booktitle = {NeurIPS 2024 Workshops: SoLaR},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/karanam2024neuripsw-policy/}
}