Learning Program Representations for Food Images and Cooking Recipes
Abstract
In this paper, we are interested in modeling a how-to instructional procedure, such as a cooking recipe, with a meaningful and rich high-level representation. Specifically, we propose to represent cooking recipes and food images as cooking programs. Programs provide a structured representation of the task, capturing cooking semantics and sequential relationships of actions in the form of a graph. This allows them to be easily manipulated by users and executed by agents. To this end, we build a model that is trained to learn a joint embedding between recipes and food images via self-supervision and jointly generate a program from this embedding as a sequence. To validate our idea, we crowdsource programs for cooking recipes and show that: (a) projecting the image-recipe embeddings into programs leads to better cross-modal retrieval results; (b) generating programs from images leads to better recognition results compared to predicting raw cooking instructions; and (c) we can generate food images by manipulating programs via optimizing the latent code of a GAN. Code, data, and models are available online.
Cite
Text
Papadopoulos et al. "Learning Program Representations for Food Images and Cooking Recipes." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01606
Markdown
[Papadopoulos et al. "Learning Program Representations for Food Images and Cooking Recipes." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/papadopoulos2022cvpr-learning/) doi:10.1109/CVPR52688.2022.01606
BibTeX
@inproceedings{papadopoulos2022cvpr-learning,
title = {{Learning Program Representations for Food Images and Cooking Recipes}},
author = {Papadopoulos, Dim P. and Mora, Enrique and Chepurko, Nadiia and Huang, Kuan Wei and Ofli, Ferda and Torralba, Antonio},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {16559--16569},
doi = {10.1109/CVPR52688.2022.01606},
url = {https://mlanthology.org/cvpr/2022/papadopoulos2022cvpr-learning/}
}