LLM Alignment Using Soft Prompt Tuning: The Case of Cultural Alignment

Masoud, Reem I.; Ferianc, Martin; Treleaven, Philip Colin; Rodrigues, Miguel R. D.

NeurIPSW 2024

Abstract

Large Language Model (LLM) alignment traditionally relies on supervised fine-tuning or alignment frameworks such as Kullback-Leibler (KL) regularization and reward models. These methods typically require labeled or preference datasets and involve updating model weights to align the LLM with the training objective or reward model. In the realm of cultural alignment, the non-differentiable nature of cultural dimensions renders these methods infeasible. To overcome this, we propose a scalable strategy that combines soft prompt tuning—which freezes the model parameters while modifying the input prompt embeddings—with Differential Evolution (DE), a black-box optimization method for cases where a differentiable objective is unattainable. This strategy ensures alignment consistency without the need for preference data or model parameter updates, significantly enhancing efficiency and mitigating overfitting. Our empirical findings indicate marked advancements in aligning LLM behavior within intricate cultural contexts, demonstrating the proposed method’s practicality and effectiveness. This work contributes to closing the gap between computational models and the complexities of human culture, offering a significant step forward in the nuanced alignment of LLMs across diverse human contexts.
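The combination described in the abstract, a frozen model whose input prompt embedding is searched by Differential Evolution against a non-differentiable score, can be illustrated with a toy sketch. Nothing below is the authors' implementation: `frozen_model`, `cultural_loss`, the random `weights`, the `target` answers, and all hyperparameters are hypothetical stand-ins, and the "model" is a fixed linear map with rounding rather than an LLM. The sketch only assumes that the objective is computed from discrete, survey-style answers, which is what makes gradient-based tuning unavailable.

```python
import random

DIM = 8          # length of the toy "soft prompt" vector (assumption)
N_QUESTIONS = 6  # number of survey-style questions (assumption)


def frozen_model(soft_prompt, weights):
    """Hypothetical stand-in for a frozen LLM: soft prompt -> answers on a 1-5 scale."""
    answers = []
    for row in weights:
        score = sum(w * p for w, p in zip(row, soft_prompt))
        # Rounding and clipping make the output piecewise constant,
        # so no useful gradient flows back to the soft prompt.
        answers.append(max(1, min(5, round(3 + score))))
    return answers


def cultural_loss(soft_prompt, weights, target):
    """Black-box objective: total absolute gap between answers and target answers."""
    answers = frozen_model(soft_prompt, weights)
    return sum(abs(a - t) for a, t in zip(answers, target))


def differential_evolution(loss, dim, pop_size=20, generations=60,
                           f=0.7, cr=0.9, bound=2.0, seed=0):
    """Minimal DE (rand/1/bin) with greedy selection; only `loss` values are used."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(pop_size)]
    scores = [loss(ind) for ind in pop]
    history = []
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    value = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    value = max(-bound, min(bound, value))
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_score = loss(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
        history.append(min(scores))
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best], history


# The "model weights" are generated once and never updated; only the prompt is searched.
weight_rng = random.Random(1)
weights = [[weight_rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_QUESTIONS)]
target = [4, 2, 5, 1, 3, 4]  # hypothetical target answers

best_prompt, best_loss, history = differential_evolution(
    lambda p: cultural_loss(p, weights, target), DIM
)
print("best loss:", best_loss)
print("answers:  ", frozen_model(best_prompt, weights))
```

Because selection is greedy, the best loss recorded in `history` never increases from one generation to the next. In a real setting the loss evaluation would require running the frozen LLM with the candidate prompt embeddings, which is far more expensive than this toy function.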

Cite

Text

Masoud et al. "LLM Alignment Using Soft Prompt Tuning: The Case of Cultural Alignment." NeurIPS 2024 Workshops: SoLaR, 2024.

Markdown

[Masoud et al. "LLM Alignment Using Soft Prompt Tuning: The Case of Cultural Alignment." NeurIPS 2024 Workshops: SoLaR, 2024.](https://mlanthology.org/neuripsw/2024/masoud2024neuripsw-llm/)

BibTeX

@inproceedings{masoud2024neuripsw-llm,
  title     = {{LLM Alignment Using Soft Prompt Tuning: The Case of Cultural Alignment}},
  author    = {Masoud, Reem I. and Ferianc, Martin and Treleaven, Philip Colin and Rodrigues, Miguel R. D.},
  booktitle = {NeurIPS 2024 Workshops: SoLaR},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/masoud2024neuripsw-llm/}
}