Navigating the Social Welfare Frontier: Portfolios for Multi-Objective Reinforcement Learning

Abstract

In many real-world applications of Reinforcement Learning (RL), deployed policies have varied impacts on different stakeholders, creating challenges in reaching consensus on how to effectively aggregate their preferences. Generalized $p$-means form a widely used class of social welfare functions for this purpose, with broad applications in fair resource allocation, AI alignment, and decision-making. This class includes well-known welfare functions such as Egalitarian, Nash, and Utilitarian welfare. However, selecting the appropriate social welfare function is challenging for decision-makers, as the structure and outcomes of optimal policies can be highly sensitive to the choice of $p$. To address this challenge, we study the concept of an $\alpha$-approximate portfolio in RL, a set of policies that are approximately optimal across the family of generalized $p$-means for all $p \in [-\infty, 1]$. We propose algorithms to compute such portfolios and provide theoretical guarantees on the trade-offs among approximation factor, portfolio size, and computational efficiency. Experimental results on synthetic and real-world datasets demonstrate the effectiveness of our approach in summarizing the policy space induced by varying $p$ values, empowering decision-makers to navigate this landscape more effectively.
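For readers unfamiliar with the welfare class, a brief note (standard definitions, not quoted from the paper): for strictly positive stakeholder utilities $u_1, \dots, u_n$, the generalized $p$-mean is

$$M_p(u) = \left(\frac{1}{n}\sum_{i=1}^{n} u_i^{p}\right)^{1/p},$$

which recovers Utilitarian welfare (the arithmetic mean) at $p = 1$, Nash welfare (the geometric mean) in the limit $p \to 0$, and Egalitarian welfare ($\min_i u_i$) as $p \to -\infty$. The sketch below is a minimal illustration of these special cases, not the paper's implementation; the function name `p_mean` and the example utilities are invented for this example.

```python
import numpy as np

def p_mean(utilities, p):
    """Generalized p-mean of strictly positive stakeholder utilities."""
    u = np.asarray(utilities, dtype=float)
    if p == -np.inf:
        return u.min()                      # Egalitarian welfare: worst-off stakeholder
    if p == 0:
        return np.exp(np.mean(np.log(u)))   # Nash welfare: geometric mean
    return np.mean(u ** p) ** (1.0 / p)     # general p-mean

# Welfare of one policy's utility vector under three choices of p
u = [0.2, 0.5, 0.9]
for p in (1.0, 0.0, -np.inf):
    print(f"p = {p}: welfare = {p_mean(u, p):.4f}")
```

In this vocabulary, a portfolio is (roughly) $\alpha$-approximate when, for every $p \in [-\infty, 1]$, some policy in the set attains welfare within a factor $\alpha$ of the optimum for that $p$; see the paper for the precise definition.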

Cite

Text

Kim et al. "Navigating the Social Welfare Frontier: Portfolios for Multi-Objective Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Kim et al. "Navigating the Social Welfare Frontier: Portfolios for Multi-Objective Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/kim2025icml-navigating/)

BibTeX

@inproceedings{kim2025icml-navigating,
  title     = {{Navigating the Social Welfare Frontier: Portfolios for Multi-Objective Reinforcement Learning}},
  author    = {Kim, Cheol Woo and Moondra, Jai and Verma, Shresth and Pollack, Madeleine and Kong, Lingkai and Tambe, Milind and Gupta, Swati},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {30631--30653},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/kim2025icml-navigating/}
}