Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries

Abstract

Learning human objectives from preference feedback has significantly advanced reinforcement learning (RL) in domains with hard-to-formalize objectives. Traditional methods based on pairwise trajectory comparisons face challenges: trajectories with subtle differences are hard to compare, and comparisons are ordinal, limiting direct inference of preference strength. In this paper, we introduce the distinguishability query, where humans compare two pairs of trajectories, indicate which pair is easier to compare, and then give preference feedback on the easier pair. This type of query supports direct inference of preference strength and is expected to reduce the cognitive load on the labeler. We also connect this query to cardinal utility and difference relations, and develop an efficient query selection scheme to achieve a better trade-off between query informativeness and easiness. Experimental results empirically demonstrate the potential of our method for faster, data-efficient learning and improved user-friendliness on RLHF benchmarks.
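The query protocol from the abstract can be sketched in Python. This is an illustrative simulation only, not the paper's implementation: the function names are invented, and using the absolute return gap as a proxy for how "easy" a pair is to compare is an assumption made here for demonstration.

```python
def trajectory_return(traj, reward_fn):
    """Sum per-step rewards over a trajectory (a list of steps)."""
    return sum(reward_fn(step) for step in traj)

def simulate_distinguishability_query(pair_a, pair_b, reward_fn):
    """Simulate a labeler answering one distinguishability query.

    The labeler compares two pairs of trajectories, reports which pair is
    easier to compare (modeled here as the pair with the larger return gap),
    and then gives ordinal preference feedback on that easier pair.
    """
    gap_a = abs(trajectory_return(pair_a[0], reward_fn)
                - trajectory_return(pair_a[1], reward_fn))
    gap_b = abs(trajectory_return(pair_b[0], reward_fn)
                - trajectory_return(pair_b[1], reward_fn))
    # Pick the easier (more distinguishable) pair, then the preferred
    # trajectory within it.
    easier = pair_a if gap_a >= gap_b else pair_b
    preferred = max(easier, key=lambda t: trajectory_return(t, reward_fn))
    return easier, preferred
```

A labeler choosing the second pair here signals that its trajectories differ more in value than the first pair's do, which is the extra cardinal signal the query is designed to capture.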

Cite

Text

Feng et al. "Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries." ICML 2024 Workshops: MFHAIA, 2024.

Markdown

[Feng et al. "Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries." ICML 2024 Workshops: MFHAIA, 2024.](https://mlanthology.org/icmlw/2024/feng2024icmlw-comparing/)

BibTeX

@inproceedings{feng2024icmlw-comparing,
  title     = {{Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries}},
  author    = {Feng, Xuening and Jiang, Zhaohui and Kaufmann, Timo and Hüllermeier, Eyke and Weng, Paul and Zhu, Yifei},
  booktitle = {ICML 2024 Workshops: MFHAIA},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/feng2024icmlw-comparing/}
}