Quantifying the Amount of Visual Information Used by Neural Caption Generators
Abstract
Image caption generation systems are typically evaluated against reference outputs. We show that it is possible to predict output quality without generating the captions, based on the probability assigned by the neural model to the reference captions. Such pre-gen metrics are strongly correlated with standard evaluation metrics.
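The sketch below is a minimal, hypothetical illustration of the idea described in the abstract (it is not the authors' code): a "pre-gen" style score is just the average log-probability the trained captioner assigns to the tokens of a reference caption under teacher forcing, so no caption has to be decoded. The function names and the assumption that per-token log-probabilities are already available are illustrative.

```python
# Minimal sketch of a pre-gen style score: the mean log-probability a caption
# model assigns to reference-caption tokens, computed without decoding.
# The source of `token_log_probs` (teacher-forcing the reference through the
# model) is assumed and not shown here.
import math
from typing import List, Sequence


def pregen_score(token_log_probs: Sequence[float]) -> float:
    """Mean per-token log-probability of one reference caption.

    token_log_probs[t] is assumed to be log P(w_t | image, w_1..w_{t-1}).
    """
    return sum(token_log_probs) / len(token_log_probs)


def corpus_pregen_score(per_caption_log_probs: List[Sequence[float]]) -> float:
    """Average the per-caption scores over a validation set.

    The abstract's claim is that such a score correlates strongly with
    standard metrics computed on generated captions.
    """
    scores = [pregen_score(lp) for lp in per_caption_log_probs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy log-probabilities for two reference captions (illustrative numbers only).
    refs = [
        [math.log(0.4), math.log(0.2), math.log(0.5)],
        [math.log(0.1), math.log(0.3)],
    ]
    print(f"corpus pre-gen score: {corpus_pregen_score(refs):.4f}")
```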
Cite
Text
Tanti et al. "Quantifying the Amount of Visual Information Used by Neural Caption Generators." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11018-5_11Markdown
[Tanti et al. "Quantifying the Amount of Visual Information Used by Neural Caption Generators." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/tanti2018eccvw-quantifying/) doi:10.1007/978-3-030-11018-5_11BibTeX
@inproceedings{tanti2018eccvw-quantifying,
title = {{Quantifying the Amount of Visual Information Used by Neural Caption Generators}},
author = {Tanti, Marc and Gatt, Albert and Camilleri, Kenneth P.},
booktitle = {European Conference on Computer Vision Workshops},
year = {2018},
pages = {124--132},
doi = {10.1007/978-3-030-11018-5_11},
url = {https://mlanthology.org/eccvw/2018/tanti2018eccvw-quantifying/}
}