Order-Free RNN with Visual Attention for Multi-Label Classification
Abstract
We propose a recurrent neural network (RNN) based model for image multi-label classification. Our model uniquely integrates the learning of visual attention and Long Short-Term Memory (LSTM) layers, which jointly learn the labels of interest and their co-occurrences while the associated image regions are visually attended. Unlike existing approaches, which utilize either model in their network architectures, training of our model does not require pre-defined label orders. Moreover, a robust inference process is introduced so that prediction errors do not propagate and degrade performance. Our experiments on the NUS-WIDE and MS-COCO datasets confirm the design of our network and its effectiveness in solving multi-label classification problems.
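The key idea of order-free prediction can be illustrated without the neural components: at each decoding step, the model scores every candidate label conditioned on the labels already emitted, and the most confident not-yet-emitted label is output, stopping when an end token wins. The sketch below is illustrative only, not the authors' code; `toy_scores` is a hypothetical stand-in for the attention/LSTM scorer, with a hand-coded co-occurrence boost mimicking what the LSTM would learn.

```python
END = "END"  # hypothetical stop token

def decode_order_free(score_fn, max_steps=10):
    """Greedily emit labels by confidence until END wins or max_steps is hit.

    No pre-defined label order is needed: the order emerges from the
    confidences themselves. Already-emitted labels are masked out, which is
    one simple way to keep an early prediction from being repeated and
    propagating errors.
    """
    emitted = []
    for _ in range(max_steps):
        scores = score_fn(emitted)  # scorer conditioned on history
        candidates = {l: s for l, s in scores.items() if l not in emitted}
        best = max(candidates, key=candidates.get)
        if best == END:
            break
        emitted.append(best)
    return emitted

def toy_scores(history):
    """Toy scorer: static confidences plus a co-occurrence boost; END grows
    more confident as more labels are emitted."""
    base = {"person": 0.9, "dog": 0.6, "frisbee": 0.4, END: 0.1}
    if "dog" in history:
        base["frisbee"] += 0.3  # stand-in for a learned co-occurrence
    base[END] += 0.25 * len(history)
    return base

print(decode_order_free(toy_scores))  # → ['person', 'dog', 'frisbee']
```

Note that the emitted sequence is ordered by model confidence (person, then dog, then frisbee), not by any fixed label ordering supplied at training time.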
Cite
Text
Chen et al. "Order-Free RNN with Visual Attention for Multi-Label Classification." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12230

Markdown

[Chen et al. "Order-Free RNN with Visual Attention for Multi-Label Classification." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/chen2018aaai-order/) doi:10.1609/AAAI.V32I1.12230

BibTeX
@inproceedings{chen2018aaai-order,
  title     = {{Order-Free RNN with Visual Attention for Multi-Label Classification}},
  author    = {Chen, Shang-Fu and Chen, Yi-Chen and Yeh, Chih-Kuan and Wang, Yu-Chiang Frank},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {6714--6721},
  doi       = {10.1609/AAAI.V32I1.12230},
  url       = {https://mlanthology.org/aaai/2018/chen2018aaai-order/}
}