Look Around and Learn: Self-Training Object Detection by Exploration

Abstract

When an object detector is deployed in a novel setting, it often experiences a drop in performance. This paper studies how an embodied agent can automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment without relying on human intervention, i.e., a fully self-supervised approach. In our setting, an agent initially learns to explore the environment using a pre-trained off-the-shelf detector to locate objects and associate pseudo-labels. By assuming that pseudo-labels for the same object must be consistent across different views, we learn the exploration policy “Look Around” to mine hard samples, and we devise a novel mechanism called “Disagreement Reconciliation” for producing refined pseudo-labels from the consensus among observations. We implement a unified benchmark of the current state-of-the-art and compare our approach with pre-existing exploration policies and perception mechanisms. Our method is shown to outperform existing approaches, improving the object detector by 6.2% in a simulated scenario, a 3.59% advancement over other state-of-the-art methods, and by 9.97% in the real robotic test without relying on ground truth. Code for the proposed approach and baselines is available at https://iit-pavis.github.io/Look_Around_And_Learn/.

Cite

Text

Scarpellini et al. "Look Around and Learn: Self-Training Object Detection by Exploration." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72992-8_5

Markdown

[Scarpellini et al. "Look Around and Learn: Self-Training Object Detection by Exploration." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/scarpellini2024eccv-look/) doi:10.1007/978-3-031-72992-8_5

BibTeX

@inproceedings{scarpellini2024eccv-look,
  title     = {{Look Around and Learn: Self-Training Object Detection by Exploration}},
  author    = {Scarpellini, Gianluca and Rosa, Stefano and Morerio, Pietro and Natale, Lorenzo and Del Bue, Alessio},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72992-8_5},
  url       = {https://mlanthology.org/eccv/2024/scarpellini2024eccv-look/}
}