Perception Score: A Learned Metric for Open-Ended Text Generation Evaluation
Abstract
Automatic evaluation of open-ended natural language generation remains a challenge. We propose a learned evaluation metric, Perception Score. It leverages a pre-trained model and incorporates context information for conditional generation. Perception Score assigns a holistic score along with an uncertainty measurement. We conduct experiments on three open-ended conditional generation tasks and two open-ended unconditional generation tasks. Perception Score consistently achieves state-of-the-art results on all tasks in terms of correlation with human evaluation scores.
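The abstract's headline claim is correlation with human evaluation scores. As a minimal illustration of how such agreement is typically measured, the sketch below computes the Pearson correlation between a metric's per-sample scores and averaged human ratings; the score values are hypothetical, not from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-sample scores: a learned metric (0-1) vs. mean human ratings (1-5).
metric_scores = [0.91, 0.40, 0.72, 0.15, 0.66]
human_scores = [4.5, 2.0, 3.8, 1.2, 3.1]

print(round(pearson(metric_scores, human_scores), 3))  # close to 1.0 = strong agreement
```

Evaluation papers of this kind often report Spearman or Kendall rank correlation alongside Pearson, since only the ranking of generated texts, not the raw score scale, needs to agree with human judges.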
Cite
Text
Gu et al. "Perception Score: A Learned Metric for Open-Ended Text Generation Evaluation." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I14.17526

Markdown

[Gu et al. "Perception Score: A Learned Metric for Open-Ended Text Generation Evaluation." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/gu2021aaai-perception/) doi:10.1609/AAAI.V35I14.17526

BibTeX
@inproceedings{gu2021aaai-perception,
title = {{Perception Score: A Learned Metric for Open-Ended Text Generation Evaluation}},
author = {Gu, Jing and Wu, Qingyang and Yu, Zhou},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
  pages = {12902--12910},
doi = {10.1609/AAAI.V35I14.17526},
url = {https://mlanthology.org/aaai/2021/gu2021aaai-perception/}
}