Boosting Long-Tailed Object Detection via Step-Wise Learning on Smooth-Tail Data
Abstract
Real-world data tends to follow a long-tailed distribution, where the class imbalance results in dominance of the head classes during training. In this paper, we propose a frustratingly simple but effective step-wise learning framework to gradually enhance the capability of the model in detecting all categories of long-tailed datasets. Specifically, we build smooth-tail data where the long-tailed distribution of categories decays smoothly to correct the bias towards head classes. We pre-train a model on the whole long-tailed data to preserve discriminability between all categories. We then fine-tune the class-agnostic modules of the pre-trained model on the head class dominant replay data to get a head class expert model with improved decision boundaries from all categories. Finally, we train a unified model on the tail class dominant replay data while transferring knowledge from the head class expert model to ensure accurate detection of all categories. Extensive experiments on the long-tailed datasets LVIS v0.5 and LVIS v1.0 demonstrate the superior performance of our method: with a ResNet-50 backbone, we improve the overall AP from 27.0% to 30.3%, and the AP for rare categories from 15.5% to 24.9%. Our best model, using a ResNet-101 backbone, achieves 30.7% AP, which surpasses all existing detectors using the same backbone.
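The idea of data whose class distribution "decays smoothly" can be illustrated with a small sketch. The abstract does not say how the smooth-tail data is actually built, so the code below swaps in a generic power-law subsampling rule as a stand-in: the function name `build_smooth_tail_indices`, the exponent `alpha`, and the toy labels are all assumptions for illustration, not the authors' procedure. It also treats each instance as having a single label, whereas detection images usually contain several categories.

```python
import random
from collections import Counter, defaultdict


def build_smooth_tail_indices(labels, alpha=0.5, seed=0):
    """Subsample instances so the head/tail count ratio shrinks from r to r**alpha.

    Hypothetical helper: a generic power-law resampling rule, NOT the
    construction used in the paper (which the abstract does not describe).
    """
    rng = random.Random(seed)

    # Group instance indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # The rarest class keeps all of its instances.
    n_min = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        # Compress the count ratio relative to the rarest class.
        target = round(n_min * (len(idxs) / n_min) ** alpha)
        target = max(1, min(len(idxs), target))
        kept.extend(rng.sample(idxs, target))
    return sorted(kept)


# Toy long-tailed label list (assumed data): 1000 / 100 / 10 instances.
labels = ["person"] * 1000 + ["dog"] * 100 + ["unicycle"] * 10
kept = build_smooth_tail_indices(labels, alpha=0.5)
counts = Counter(labels[i] for i in kept)
print(counts)
```

With `alpha=0.5` the 100:1 imbalance between the most and least frequent class in the toy data drops to 10:1, while the rarest class is left untouched. Smaller values of `alpha` flatten the distribution further.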
Cite
Text
Dong et al. "Boosting Long-Tailed Object Detection via Step-Wise Learning on Smooth-Tail Data." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00639
Markdown
[Dong et al. "Boosting Long-Tailed Object Detection via Step-Wise Learning on Smooth-Tail Data." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/dong2023iccv-boosting/) doi:10.1109/ICCV51070.2023.00639
BibTeX
@inproceedings{dong2023iccv-boosting,
title = {{Boosting Long-Tailed Object Detection via Step-Wise Learning on Smooth-Tail Data}},
author = {Dong, Na and Zhang, Yongqiang and Ding, Mingli and Lee, Gim Hee},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {6940-6949},
doi = {10.1109/ICCV51070.2023.00639},
url = {https://mlanthology.org/iccv/2023/dong2023iccv-boosting/}
}