Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning
Abstract
Human-oriented image captioning that is both diverse and accurate is a challenging task in vision+language modeling. Reinforcement learning (RL) based frameworks improve the accuracy of image captioning but seriously hurt diversity. In contrast, methods based on variational auto-encoders (VAEs) or generative adversarial networks (GANs) produce diverse yet less accurate captions. In this work, we focus on promoting the diversity of RL-based image captioning. Specifically, we devise a partial off-policy learning scheme to balance accuracy and diversity. First, we keep the model exposed to varied candidate captions by sampling from the initial state before RL is launched. Second, we propose a novel criterion named max-CIDEr to serve as the reward for promoting diversity. We combine this off-policy strategy with the on-policy one to moderate the exploration effect, further balancing diversity and accuracy for human-like image captioning. Experiments show that our method lies closest to human performance in the diversity-accuracy space and achieves the highest Pearson correlation with human performance, 0.337.
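The idea of combining an off-policy term with an on-policy term can be illustrated with a minimal sketch. The abstract does not give the paper's loss, the definition of max-CIDEr, or the mixing rule, so everything below is an assumption made for illustration: the function names, the mean-reward baseline, the convex weight `off_weight`, and the omission of any importance weighting are hypothetical and are not taken from the paper. Rewards are passed in as plain numbers; in this sketch the off-policy rewards would stand in for max-CIDEr scores of captions sampled from the frozen pre-RL model.

```python
from typing import Sequence


def reinforce_surrogate(log_probs: Sequence[float], rewards: Sequence[float]) -> float:
    """REINFORCE-style surrogate loss with a mean-reward baseline.

    log_probs[i] is the current policy's log-probability of caption i,
    rewards[i] is the scalar reward assigned to that caption.
    """
    if len(log_probs) != len(rewards) or len(log_probs) == 0:
        raise ValueError("log_probs and rewards must be non-empty and equally long")
    baseline = sum(rewards) / len(rewards)
    weighted = sum((r - baseline) * lp for lp, r in zip(log_probs, rewards))
    return -weighted / len(rewards)


def partial_off_policy_loss(
    on_log_probs: Sequence[float],
    on_rewards: Sequence[float],
    off_log_probs: Sequence[float],
    off_rewards: Sequence[float],
    off_weight: float,
) -> float:
    """Convex mix of an on-policy and an off-policy surrogate (assumed form).

    on_*  : captions sampled from the current policy.
    off_* : captions sampled from the frozen initial (pre-RL) model, scored
            under the current policy. Importance weights are deliberately
            left out of this sketch.
    off_weight : hypothetical mixing coefficient in [0, 1].
    """
    if not 0.0 <= off_weight <= 1.0:
        raise ValueError("off_weight must lie in [0, 1]")
    on_loss = reinforce_surrogate(on_log_probs, on_rewards)
    off_loss = reinforce_surrogate(off_log_probs, off_rewards)
    return (1.0 - off_weight) * on_loss + off_weight * off_loss
```

Setting `off_weight` to 0 recovers a purely on-policy update, while 1 trains only on captions drawn from the initial model; intermediate values correspond to the "partial" regime the abstract describes in words.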
Cite
Text
Shi et al. "Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00219
Markdown
[Shi et al. "Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/shi2021iccv-partial/) doi:10.1109/ICCV48922.2021.00219
BibTeX
@inproceedings{shi2021iccv-partial,
title = {{Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning}},
author = {Shi, Jiahe and Li, Yali and Wang, Shengjin},
booktitle = {International Conference on Computer Vision},
year = {2021},
pages = {2187--2196},
doi = {10.1109/ICCV48922.2021.00219},
url = {https://mlanthology.org/iccv/2021/shi2021iccv-partial/}
}