Group Preference Optimization: Few-Shot Alignment of Large Language Models

Abstract

Applications of large language models (LLMs) often demand nuanced judgments that vary across groups. Existing alignment algorithms can be costly, requiring extensive group-specific preference data and computation. We present Group Preference Optimization (GPO), a framework that efficiently aligns LLMs to group preferences in a few-shot manner. In GPO, we augment the base LLM with an independent transformer module trained to predict a group's preferences over the LLM's generations. For few-shot learning, this module acts as an in-context autoregressive transformer and is trained via meta-learning on several groups. Through empirical validation on opinion-adaptation tasks involving US demographic groups, global countries, and individuals, GPO achieves superior alignment while requiring fewer group-specific preferences and less training computation, outperforming existing strategies such as in-context steering and fine-tuning.
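The abstract describes the added module only at a high level: an independent transformer that, given a handful of a group's preference examples as in-context demonstrations, predicts that group's preferences over new LLM generations. The sketch below illustrates one way such an in-context preference predictor could be structured in PyTorch. The class name, dimensions, and the simple encoder-over-concatenated-pairs design are illustrative assumptions, not the authors' implementation; in particular, a plain bidirectional encoder is used here instead of the masked autoregressive variant the paper describes.

import torch
import torch.nn as nn

class GroupPreferenceModule(nn.Module):
    """Hypothetical sketch of an in-context group preference predictor.

    Context: (embedding, preference) pairs observed for one group.
    Targets: embeddings of new generations whose preferences we predict.
    """

    def __init__(self, embed_dim=512, hidden_dim=256, n_layers=4, n_heads=4):
        super().__init__()
        self.x_proj = nn.Linear(embed_dim, hidden_dim)  # project LLM generation embeddings
        self.y_proj = nn.Linear(1, hidden_dim)          # project scalar preference labels
        layer = nn.TransformerEncoderLayer(hidden_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(hidden_dim, 1)            # output a preference score

    def forward(self, ctx_x, ctx_y, tgt_x):
        # ctx_x: (B, m, embed_dim), ctx_y: (B, m, 1), tgt_x: (B, n, embed_dim)
        ctx = self.x_proj(ctx_x) + self.y_proj(ctx_y)   # fuse each context pair
        tgt = self.x_proj(tgt_x)                        # targets carry no preference yet
        h = self.encoder(torch.cat([ctx, tgt], dim=1))  # attend jointly over context and targets
        return self.head(h[:, ctx_x.size(1):])          # scores for the target positions only

# Toy usage (shapes only): 8 context pairs per group, 4 new generations to score.
module = GroupPreferenceModule()
ctx_x = torch.randn(2, 8, 512)
ctx_y = torch.rand(2, 8, 1)
tgt_x = torch.randn(2, 4, 512)
scores = module(ctx_x, ctx_y, tgt_x)  # (2, 4, 1) predicted group preferences

Under this reading, the meta-learning the abstract mentions would correspond to sampling a different group per training batch, splitting its preference data into context and target sets, and fitting the predicted target scores to the held-out preferences.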

Cite

Text

Zhao et al. "Group Preference Optimization: Few-Shot Alignment of Large Language Models." NeurIPS 2023 Workshops: R0-FoMo, 2023.

Markdown

[Zhao et al. "Group Preference Optimization: Few-Shot Alignment of Large Language Models." NeurIPS 2023 Workshops: R0-FoMo, 2023.](https://mlanthology.org/neuripsw/2023/zhao2023neuripsw-group-a/)

BibTeX

@inproceedings{zhao2023neuripsw-group-a,
  title     = {{Group Preference Optimization: Few-Shot Alignment of Large Language Models}},
  author    = {Zhao, Siyan and Dang, John and Grover, Aditya},
  booktitle = {NeurIPS 2023 Workshops: R0-FoMo},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/zhao2023neuripsw-group-a/}
}