Towards Neural Similarity Evaluator

Abstract

We review three limitations of BLEU and ROUGE, the most popular metrics used to assess reference summaries against hypothesis summaries; propose criteria for how a good metric should behave; present concrete ways to assess a metric's performance in detail; and show the potential of Transformer-based language models to assess reference summaries against hypothesis summaries.
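To make the limitation concrete, here is a minimal sketch (not the paper's code) of a unigram-overlap score in the spirit of ROUGE-1 recall. It illustrates a known weakness of n-gram metrics that motivates neural evaluators: a paraphrase with the same meaning but different words scores poorly.

```python
# Hypothetical illustration: unigram-overlap recall (ROUGE-1-style),
# showing that surface n-gram metrics penalize meaning-preserving paraphrases.
from collections import Counter

def rouge1_recall(reference: str, hypothesis: str) -> float:
    """Fraction of reference unigrams also present in the hypothesis
    (clipped counts, as in standard n-gram overlap metrics)."""
    ref = Counter(reference.lower().split())
    hyp = Counter(hypothesis.lower().split())
    overlap = sum(min(ref[w], hyp[w]) for w in ref)
    return overlap / max(sum(ref.values()), 1)

reference = "the cat sat on the mat"
verbatim = "the cat sat on the mat"
paraphrase = "a feline rested on the rug"  # same meaning, few shared words

print(rouge1_recall(reference, verbatim))    # 1.0
print(rouge1_recall(reference, paraphrase))  # low despite equivalent meaning
```

A neural similarity evaluator, by comparing contextual representations rather than surface tokens, can assign the paraphrase a score closer to that of the verbatim copy.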

Cite

Text

Kané et al. "Towards Neural Similarity Evaluator." NeurIPS 2019 Workshops: Document_Intelligence, 2019.

Markdown

[Kané et al. "Towards Neural Similarity Evaluator." NeurIPS 2019 Workshops: Document_Intelligence, 2019.](https://mlanthology.org/neuripsw/2019/kane2019neuripsw-neural/)

BibTeX

@inproceedings{kane2019neuripsw-neural,
  title     = {{Towards Neural Similarity Evaluator}},
  author    = {Kané, Hassan and Kocyigit, Yusuf and Ajanoh, Pelkins and Abdalla, Ali and Coulibali, Mohamed},
  booktitle = {NeurIPS 2019 Workshops: Document_Intelligence},
  year      = {2019},
  url       = {https://mlanthology.org/neuripsw/2019/kane2019neuripsw-neural/}
}