Multi-Modal Multi-Task Learning for Automatic Dietary Assessment

Abstract

We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, the goal is to automatically rate the meals and deliver advisory comments for improving users' diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, we propose an end-to-end neural model. In particular, comprehensive meal representations are obtained from images, descriptions, and user information. We further introduce a novel memory network architecture to store meal representations and reason over them to support predictions. Results on a real-world dataset show that our method significantly outperforms two strong image-captioning baselines.
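
The abstract does not give implementation details, so the following is only a minimal sketch of how the pieces it names (multi-modal fusion of image, description, and user features; a memory over meal representations; and two task heads, one for rating and one for advisory-comment generation) could be wired together in PyTorch. All module names, dimensions, and the dot-product attention read over memory are illustrative assumptions, not the authors' published architecture.

import torch
import torch.nn as nn

class DietaryAssessmentSketch(nn.Module):
    """Illustrative multi-modal, multi-task model; dimensions are assumed."""
    def __init__(self, img_dim=2048, txt_dim=300, user_dim=32,
                 hidden=256, memory_slots=50, vocab_size=10000):
        super().__init__()
        # Fuse image features, description embedding, and user information
        self.fuse = nn.Linear(img_dim + txt_dim + user_dim, hidden)
        # External memory of (learned) meal representations
        self.memory = nn.Parameter(torch.randn(memory_slots, hidden))
        # Task 1: meal rating (regression head over fused + memory read)
        self.rating_head = nn.Linear(2 * hidden, 1)
        # Task 2: advisory comment generation (simple GRU decoder)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, img_feat, txt_feat, user_feat, comment_emb):
        # comment_emb: (B, T, hidden) embedded comment tokens (teacher forcing)
        meal = torch.tanh(self.fuse(torch.cat([img_feat, txt_feat, user_feat], dim=-1)))
        # Attention over memory: read the slots most relevant to this meal
        attn = torch.softmax(meal @ self.memory.t(), dim=-1)       # (B, slots)
        read = attn @ self.memory                                  # (B, hidden)
        rating = self.rating_head(torch.cat([meal, read], dim=-1)) # (B, 1)
        # Condition the comment decoder on the fused meal + memory read
        h0 = (meal + read).unsqueeze(0)                            # (1, B, hidden)
        dec_out, _ = self.decoder(comment_emb, h0)
        comment_logits = self.out(dec_out)                         # (B, T, vocab)
        return rating, comment_logits

Under these assumptions, the two outputs would be trained jointly, e.g. with a mean-squared-error loss on the rating and a cross-entropy loss on the comment tokens, which is one common way to realize the multi-task setup the abstract describes.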

Cite

Text

Liu et al. "Multi-Modal Multi-Task Learning for Automatic Dietary Assessment." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11848

Markdown

[Liu et al. "Multi-Modal Multi-Task Learning for Automatic Dietary Assessment." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/liu2018aaai-multi-a/) doi:10.1609/AAAI.V32I1.11848

BibTeX

@inproceedings{liu2018aaai-multi-a,
  title     = {{Multi-Modal Multi-Task Learning for Automatic Dietary Assessment}},
  author    = {Liu, Qi and Zhang, Yue and Liu, Zhenguang and Yuan, Ye and Cheng, Li and Zimmermann, Roger},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {2347--2354},
  doi       = {10.1609/AAAI.V32I1.11848},
  url       = {https://mlanthology.org/aaai/2018/liu2018aaai-multi-a/}
}