Preventing Catastrophic Forgetting Through Memory Networks in Continuous Detection

Abstract

Modern pre-trained architectures struggle to retain previously learned information while being continually fine-tuned on new tasks. Despite notable progress in continual classification, systems designed for complex vision tasks such as detection or segmentation still fall short of satisfactory performance. In this work, we introduce a memory-based detection transformer architecture that adapts a pre-trained DETR-style detector to new tasks while preserving knowledge from previous tasks. We propose a novel localized query function for efficient information retrieval from memory units, aiming to minimize forgetting. Furthermore, we identify a fundamental challenge in continual detection, referred to as background relegation. It arises when object categories from earlier tasks reappear in future tasks, potentially without labels, so that they are implicitly treated as background. This issue is inevitable in continual detection and segmentation. The continual optimization technique we introduce effectively tackles this challenge. Finally, we assess the proposed system on continual detection benchmarks and demonstrate that it surpasses the existing state of the art, with improvements of 5-7% on MS-COCO and PASCAL-VOC on the task of continual detection. Code: https://github.com/GauravBh1010tt/MD-DETR

Cite

Text

Bhatt et al. "Preventing Catastrophic Forgetting Through Memory Networks in Continuous Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72907-2_26

Markdown

[Bhatt et al. "Preventing Catastrophic Forgetting Through Memory Networks in Continuous Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/bhatt2024eccv-preventing/) doi:10.1007/978-3-031-72907-2_26

BibTeX

@inproceedings{bhatt2024eccv-preventing,
  title     = {{Preventing Catastrophic Forgetting Through Memory Networks in Continuous Detection}},
  author    = {Bhatt, Gaurav and Sigal, Leonid and Ross, James},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72907-2_26},
  url       = {https://mlanthology.org/eccv/2024/bhatt2024eccv-preventing/}
}