Making Use of Unlabeled Data: Comparing Strategies for Marine Animal Detection in Long-Tailed Datasets Using Self-Supervised and Semi-Supervised Pre-Training
Abstract
This paper discusses strategies for object detection in marine images from a practitioner’s perspective working with real-world long-tail distributed datasets with a large amount of additional unlabeled data on hand. The paper discusses the benefits of separating the localization and classification stages, making the case for robustness in localization through the amalgamation of additional datasets inspired by a widely used approach by practitioners in the camera-trap literature. For the classification stage, the paper compares strategies to use additional unlabeled data, comparing supervised, supervised iteratively, self-supervised, and semi-supervised pre-training approaches. Our findings reveal that semi-supervised pre-training, followed by supervised fine-tuning, yields a significantly improved balanced performance across the long-tail distribution, albeit occasionally with a trade-off in overall accuracy. These insights are validated through experiments on two real-world long-tailed underwater datasets collected by the Monterey Bay Aquarium Research Institute (MBARI).
Cite
Text
Sharma et al. "Making Use of Unlabeled Data: Comparing Strategies for Marine Animal Detection in Long-Tailed Datasets Using Self-Supervised and Semi-Supervised Pre-Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00129
Markdown
[Sharma et al. "Making Use of Unlabeled Data: Comparing Strategies for Marine Animal Detection in Long-Tailed Datasets Using Self-Supervised and Semi-Supervised Pre-Training." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/sharma2024cvprw-making/) doi:10.1109/CVPRW63382.2024.00129
BibTeX
@inproceedings{sharma2024cvprw-making,
title = {{Making Use of Unlabeled Data: Comparing Strategies for Marine Animal Detection in Long-Tailed Datasets Using Self-Supervised and Semi-Supervised Pre-Training}},
author = {Sharma, Tarun and Cline, Danelle E. and Edgington, Duane},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {1224-1233},
doi = {10.1109/CVPRW63382.2024.00129},
url = {https://mlanthology.org/cvprw/2024/sharma2024cvprw-making/}
}