An Integrated YOLO and VLM System for Fire Detection in Enclosed Environments

Abstract

While YOLO models show promise in car fire detection, they remain insufficient for real-world deployment in confined parking environments due to dataset limitations, evaluation gaps, and deployment constraints. We first fine-tune YOLO on a fire/smoke-augmented dataset, but our analysis reveals that it struggles with ambiguous fire-smoke boundaries, which leads to false predictions. To address this, we propose a real-time, end-to-end framework that integrates YOLOv8s with the Florence2 VLM, combining object detection with contextual reasoning. Although pairing YOLOv8s with the VLM improves detection reliability, challenges remain. Our findings highlight YOLO’s limitations in fire detection and the need for a more adaptive, environment-aware approach.
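
The abstract does not say how the detector and the VLM are wired together. The sketch below shows one plausible arrangement, a confidence-gated cascade, and is not the authors' implementation. Everything in it is a hypothetical chosen for illustration: the names `Detection` and `cascade_detect`, the `detect` and `verify` callables, and both thresholds. The idea is that detections the detector is already sure about pass straight through, while ambiguous ones (for example, an unclear fire-smoke boundary) are sent to a slower contextual check.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One candidate region from the object detector (hypothetical type)."""
    label: str                       # e.g. "fire" or "smoke"
    confidence: float                # detector score in [0, 1]
    box: Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels


def cascade_detect(
    frame: Any,
    detect: Callable[[Any], Sequence[Detection]],
    verify: Callable[[Any, Detection], bool],
    accept_at: float = 0.8,      # assumed threshold, not from the paper
    reject_below: float = 0.25,  # assumed threshold, not from the paper
) -> List[Detection]:
    """Return the detections that survive the two-stage check.

    detect: stands in for the detector stage (YOLOv8s in the paper).
    verify: stands in for the VLM stage (Florence2 in the paper); it should
            return True when the region plausibly shows fire or smoke.
    """
    confirmed: List[Detection] = []
    for det in detect(frame):
        if det.confidence < reject_below:
            # Too weak to be worth a VLM call.
            continue
        # High-confidence detections skip the VLM to keep latency low;
        # the `or` short-circuits, so verify() runs only for the
        # ambiguous middle band.
        if det.confidence >= accept_at or verify(frame, det):
            confirmed.append(det)
    return confirmed
```

In a real system, `detect` would wrap a fine-tuned detector and `verify` would crop the box and prompt the VLM with a question about the region. Gating the VLM on detector confidence is a design choice aimed at the real-time constraint the abstract mentions, since the VLM is the more expensive stage.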

Cite

Text

Kim et al. "An Integrated YOLO and VLM System for Fire Detection in Enclosed Environments." ICLR 2025 Workshops: ICBINB, 2025.

Markdown

[Kim et al. "An Integrated YOLO and VLM System for Fire Detection in Enclosed Environments." ICLR 2025 Workshops: ICBINB, 2025.](https://mlanthology.org/iclrw/2025/kim2025iclrw-integrated/)

BibTeX

@inproceedings{kim2025iclrw-integrated,
  title     = {{An Integrated YOLO and VLM System for Fire Detection in Enclosed Environments}},
  author    = {Kim, Joanne and Lee, Yejin and Yoon, DongSik and Jung, Chansung and Lee, Gunhee},
  booktitle = {ICLR 2025 Workshops: ICBINB},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/kim2025iclrw-integrated/}
}