LOFI: LOng-Tailed FIne-Grained Network for Food Recognition

Abstract

Food recognition plays a crucial role in several healthcare applications. Nevertheless, it presents significant computer vision challenges, such as long-tailed and fine-grained distributions, that hinder its progress. In this work, we propose LOFI, a Long-tailed Fine-grained Network aimed specifically at tackling these food recognition challenges by improving the feature learning capabilities of food recognition models. Specifically, we improve the vanilla R-CNN architecture by tailoring it for food recognition. First, we design an efficient multi-task framework for fine-grained food recognition, which exploits the lexical similarity of dishes during training to improve the discriminative ability of the network. Second, we include a Graph Confidence Propagation module based on graph neural networks to aggregate the information of overlapping detections and refine the final prediction of the network. Extensive analysis and ablations of the different components of LOFI highlight that it successfully addresses the targeted problems and leads to noticeable gains in performance. Remarkably, the proposed method achieves competitive results and outperforms the current state-of-the-art methods on three public food benchmarks: UECFood-256, AiCrowd Food Challenge 2022, and UECFood-100 segmented.

Cite

Text

Rodríguez-de-Vera et al. "LOFI: LOng-Tailed FIne-Grained Network for Food Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00379

Markdown

[Rodríguez-de-Vera et al. "LOFI: LOng-Tailed FIne-Grained Network for Food Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/rodriguezdevera2024cvprw-lofi/) doi:10.1109/CVPRW63382.2024.00379

BibTeX

@inproceedings{rodriguezdevera2024cvprw-lofi,
  title     = {{LOFI: LOng-Tailed FIne-Grained Network for Food Recognition}},
  author    = {Rodríguez-de-Vera, Jesús M. and Estepa, Imanol G. and Bolaños, Marc and Nagarajan, Bhalaji and Radeva, Petia},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {3750-3760},
  doi       = {10.1109/CVPRW63382.2024.00379},
  url       = {https://mlanthology.org/cvprw/2024/rodriguezdevera2024cvprw-lofi/}
}