Towards Neural Similarity Evaluator
Abstract
We review three limitations of BLEU and ROUGE, the most popular metrics used to assess hypothesis summaries against reference summaries; propose criteria for how a good metric should behave; describe concrete ways to assess a metric's performance in detail; and show the potential of Transformer-based language models for evaluating hypothesis summaries against reference summaries.
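For context, BLEU and ROUGE are n-gram overlap metrics. A minimal sketch of ROUGE-1 recall (unigram overlap), written from scratch for illustration and not taken from the paper, shows the style of surface matching these metrics rely on:

```python
from collections import Counter

def rouge1_recall(reference: str, hypothesis: str) -> float:
    """Unigram-overlap recall: fraction of reference tokens matched in the hypothesis."""
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    # Clipped counts: each hypothesis token can match a reference token at most once.
    overlap = sum(min(count, hyp[token]) for token, count in ref.items())
    return overlap / max(sum(ref.values()), 1)

# An exact copy scores 1.0, while a paraphrase with little word overlap scores
# near 0 -- one motivation for the neural evaluators the paper proposes.
print(rouge1_recall("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
print(rouge1_recall("the cat sat on the mat", "a feline rested on a rug"))
```

Because such metrics reward lexical overlap rather than meaning, semantically equivalent paraphrases can be penalized, which is the kind of limitation the paper examines.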
Cite
Text
Kané et al. "Towards Neural Similarity Evaluator." NeurIPS 2019 Workshops: Document_Intelligence, 2019.
Markdown
[Kané et al. "Towards Neural Similarity Evaluator." NeurIPS 2019 Workshops: Document_Intelligence, 2019.](https://mlanthology.org/neuripsw/2019/kane2019neuripsw-neural/)
BibTeX
@inproceedings{kane2019neuripsw-neural,
  title = {{Towards Neural Similarity Evaluator}},
  author = {Kané, Hassan and Kocyigit, Yusuf and Ajanoh, Pelkins and Abdalla, Ali and Coulibali, Mohamed},
  booktitle = {NeurIPS 2019 Workshops: Document_Intelligence},
  year = {2019},
  url = {https://mlanthology.org/neuripsw/2019/kane2019neuripsw-neural/}
}