Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI
Abstract
Emerging research in Pluralistic AI alignment seeks to address how to design and deploy intelligent systems in accordance with diverse human needs and values. We contribute a potential approach for aligning AI with diverse and shifting user preferences through Multi-Objective Reinforcement Learning (MORL), via post-learning policy-selection adjustment. This paper introduces the proposed framework, outlines its anticipated advantages and assumptions, and discusses technical details for implementation. We also examine the broader implications of adopting a retroactive alignment approach from a sociotechnical systems perspective.
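The core idea of post-learning policy selection can be sketched as follows. This is a minimal illustration under assumed interfaces, not the paper's implementation: each pre-trained policy is summarized by a value vector over objectives, and a user's preference weights select the policy with the highest scalarized (weighted-sum) value, so preferences can shift without retraining. All names (`policy_values`, `select_policy`) are hypothetical.

```python
import numpy as np

# Hypothetical value vectors for three pre-trained policies over two
# objectives, e.g. (task performance, user comfort). In MORL these would
# come from multi-objective training; here they are illustrative constants.
policy_values = {
    "policy_a": np.array([0.9, 0.2]),
    "policy_b": np.array([0.5, 0.5]),
    "policy_c": np.array([0.1, 0.9]),
}

def select_policy(preference_weights):
    """Return the policy whose weighted-sum value best matches the user's
    stated preferences (linear scalarization over objectives)."""
    w = np.asarray(preference_weights, dtype=float)
    w = w / w.sum()  # normalize preferences to a convex weighting
    return max(policy_values, key=lambda name: float(w @ policy_values[name]))

# A user who prioritizes the first objective selects one policy; if their
# preferences later shift toward the second objective, the selection
# adapts retroactively, with no retraining.
print(select_policy([0.8, 0.2]))  # → policy_a
print(select_policy([0.2, 0.8]))  # → policy_c
```

Linear scalarization is the simplest selection mechanism; it only recovers policies on the convex hull of the Pareto front, which is one reason a full MORL framework may use richer preference models.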
Cite
Text
Harland et al. "Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI." NeurIPS 2024 Workshops: Pluralistic-Alignment, 2024.
Markdown
[Harland et al. "Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI." NeurIPS 2024 Workshops: Pluralistic-Alignment, 2024.](https://mlanthology.org/neuripsw/2024/harland2024neuripsw-adaptive/)
BibTeX
@inproceedings{harland2024neuripsw-adaptive,
title = {{Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI}},
author = {Harland, Hadassah and Dazeley, Richard and Vamplew, Peter and Senaratne, Hashini and Nakisa, Bahareh and Cruz, Francisco},
booktitle = {NeurIPS 2024 Workshops: Pluralistic-Alignment},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/harland2024neuripsw-adaptive/}
}