Multi-Modal Multi-Task Learning for Automatic Dietary Assessment
Abstract
We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, our task is to automatically rate the meals and deliver advisory comments for improving users' diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, an end-to-end neural model is proposed. In particular, comprehensive meal representations are obtained from images, descriptions and user information. We further introduce a novel memory network architecture to store meal representations and reason over the meal representations to support predictions. Results on a real-world dataset show that our method outperforms two strong image captioning baselines significantly.
Cite
Text
Liu et al. "Multi-Modal Multi-Task Learning for Automatic Dietary Assessment." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11848
Markdown
[Liu et al. "Multi-Modal Multi-Task Learning for Automatic Dietary Assessment." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/liu2018aaai-multi-a/) doi:10.1609/AAAI.V32I1.11848
BibTeX
@inproceedings{liu2018aaai-multi-a,
title = {{Multi-Modal Multi-Task Learning for Automatic Dietary Assessment}},
author = {Liu, Qi and Zhang, Yue and Liu, Zhenguang and Yuan, Ye and Cheng, Li and Zimmermann, Roger},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2018},
pages = {2347-2354},
doi = {10.1609/AAAI.V32I1.11848},
url = {https://mlanthology.org/aaai/2018/liu2018aaai-multi-a/}
}