Long-Tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction
Abstract
Pre-training plays a vital role in various vision tasks, such as object recognition and detection. Commonly used pre-training methods, which typically rely on randomized approaches like uniform or Gaussian distributions to initialize model parameters, often fall short when confronted with long-tailed distributions, especially in detection tasks. This is largely due to extreme data imbalance and the issue of simplicity bias. In this paper, we introduce a novel pre-training framework for object detection, called Dynamic Rebalancing Contrastive Learning with Dual Reconstruction (2DRCL). Our method builds on a Holistic-Local Contrastive Learning mechanism, which aligns pre-training with object detection by capturing both global contextual semantics and detailed local patterns. To tackle the imbalance inherent in long-tailed data, we design a dynamic rebalancing strategy that adjusts the sampling of underrepresented instances throughout the pre-training process, ensuring better representation of tail classes. Moreover, Dual Reconstruction addresses simplicity bias by enforcing a reconstruction task aligned with the self-consistency principle, specifically benefiting underrepresented tail classes. Experiments on COCO and LVIS v1.0 datasets demonstrate the effectiveness of our method, particularly in improving the mAP/AP scores for tail classes.
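The abstract does not say how the dynamic rebalancing strategy adjusts sampling, so the following is only a minimal sketch of one common way such a schedule could look, not the authors' method. It assumes a linear interpolation from instance-balanced to class-balanced sampling as pre-training progresses. The names `sampling_weights`, `class_mass`, and `progress` are hypothetical and do not come from the paper.

```python
from collections import Counter


def sampling_weights(labels, progress):
    """Per-instance sampling probabilities for a long-tailed label list.

    ASSUMPTION (not from the paper): linearly interpolate between
    instance-balanced sampling (progress = 0.0) and class-balanced
    sampling (progress = 1.0).
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    counts = Counter(labels)
    num_classes = len(counts)
    num_instances = len(labels)
    weights = []
    for label in labels:
        # Every instance is equally likely.
        instance_balanced = 1.0 / num_instances
        # Every class is equally likely, split evenly among its instances.
        class_balanced = 1.0 / (num_classes * counts[label])
        weights.append((1.0 - progress) * instance_balanced
                       + progress * class_balanced)
    return weights


def class_mass(labels, weights):
    """Total sampling probability assigned to each class."""
    mass = Counter()
    for label, weight in zip(labels, weights):
        mass[label] += weight
    return dict(mass)


# Toy long-tailed set: 8 head instances, 2 tail instances.
labels = ["head"] * 8 + ["tail"] * 2
early = class_mass(labels, sampling_weights(labels, progress=0.0))
late = class_mass(labels, sampling_weights(labels, progress=1.0))
```

In this toy setting the tail class's share of the sampling probability grows from its raw frequency at the start to an equal share with the head class at the end. The resulting weights can be passed to `random.choices(range(len(labels)), weights=..., k=...)` to draw instance indices.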
Cite
Text
Duan et al. "Long-Tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction." Neural Information Processing Systems, 2024. doi:10.52202/079017-1272
Markdown
[Duan et al. "Long-Tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/duan2024neurips-longtailed/) doi:10.52202/079017-1272
BibTeX
@inproceedings{duan2024neurips-longtailed,
title = {{Long-Tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction}},
author = {Duan, Chen-Long and Li, Yong and Wei, Xiu-Shen and Zhao, Lin},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-1272},
url = {https://mlanthology.org/neurips/2024/duan2024neurips-longtailed/}
}