Recognition and Localization of Food in Cooking Videos

Abstract

In this paper, we describe experiments with techniques for locating foods and recognizing food states in cooking videos. We describe the production of a new data set that provides annotated images of food types and food states. We compare the results of two techniques for detecting food types and food states, and then show that recognizing type and state with separate classifiers improves recognition results. We then use this approach to detect composite activation maps for food types. The results provide a promising first step towards the construction of narratives for cooking actions.

Cite

Text

Bakr et al. "Recognition and Localization of Food in Cooking Videos." International Joint Conference on Artificial Intelligence, 2018. doi:10.1145/3230519.3230590

Markdown

[Bakr et al. "Recognition and Localization of Food in Cooking Videos." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/bakr2018ijcai-recognition/) doi:10.1145/3230519.3230590

BibTeX

@inproceedings{bakr2018ijcai-recognition,
  title     = {{Recognition and Localization of Food in Cooking Videos}},
  author    = {Bakr, Nachwa Abou and Ronfard, Rémi and Crowley, James L.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {21--24},
  doi       = {10.1145/3230519.3230590},
  url       = {https://mlanthology.org/ijcai/2018/bakr2018ijcai-recognition/}
}