3D Object Detection and Tracking Refinement with Ensemble Methods and Spatiotemporal Filtering
Abstract
This paper introduces a fusion strategy for 3D object detection that combines detection and segmentation inference to reduce the false positive rate. The methodology has two steps: first, we apply a geometric filter to the 3D segmentation outputs confined by the detected bounding boxes; then, we use an ensemble technique to fuse and refine the detections. By integrating detailed local point-level characteristics from segmentation with the detector's higher-level bounding box labels, our technique lowers the false positive rate. Several existing state-of-the-art methods encounter challenges due to the disparity between training on standardized datasets and real-world deployment. This disparity often degrades results through increased false negatives, which in turn impacts tracker performance. To address these issues, we propose a novel spatiotemporal filtering algorithm called the Add-Drop Re-Identification Tracker (ADRIT), which rectifies tracking ID switches and detection drops without the need for consistent GPS/IMU information, thereby enhancing performance for tracking-by-detection approaches. On the validation sets of the KITTI and NuScenes datasets, our proposed technique exhibits a reduction in false positives while achieving a modest 2–3% accuracy enhancement compared to the baseline model. Our refinement methods are robust against sensor errors where SOTA trackers fail to sustain their performance. Our code is available at github.com/sandeshrjain/3d-det-trk-refine/tree/main
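The first step described in the abstract, filtering detections with the segmentation output inside each box, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes axis-aligned boxes in the layout `[xmin, ymin, zmin, xmax, ymax, zmax]`, per-point class labels from a segmentation network, and two hypothetical thresholds (`min_support`, `min_points`); the function name `segmentation_support_mask` is likewise made up for illustration. A box is kept only when enough of the points inside it carry the same class label as the detection.

```python
import numpy as np

def segmentation_support_mask(boxes, box_labels, points, point_labels,
                              min_support=0.5, min_points=5):
    """Return a boolean mask over boxes: True where segmentation agrees.

    boxes:        (N, 6) axis-aligned boxes [xmin, ymin, zmin, xmax, ymax, zmax]
                  (assumed layout; real detectors usually output oriented boxes)
    box_labels:   (N,) class id predicted by the detector for each box
    points:       (M, 3) point cloud coordinates
    point_labels: (M,) class id predicted by the segmentation network
    min_support:  hypothetical minimum fraction of in-box points that must
                  share the box's class
    min_points:   hypothetical minimum number of points inside the box
    """
    boxes = np.asarray(boxes, dtype=float)
    points = np.asarray(points, dtype=float)
    point_labels = np.asarray(point_labels)
    keep = np.zeros(len(boxes), dtype=bool)

    for i, (box, label) in enumerate(zip(boxes, box_labels)):
        # Points whose x, y and z all fall within the box extents.
        inside = np.all((points >= box[:3]) & (points <= box[3:]), axis=1)
        if inside.sum() < min_points:
            continue  # too few points to trust the detection
        # Fraction of in-box points whose segmentation label matches the box.
        support = np.mean(point_labels[inside] == label)
        keep[i] = support >= min_support
    return keep

# Toy data: two "car" detections (class 1). Only the first is backed by
# points that the segmentation network also labels as class 1.
boxes = [[0, 0, 0, 2, 2, 2], [10, 10, 10, 12, 12, 12]]
box_labels = [1, 1]
points = np.vstack([np.tile([1.0, 1.0, 1.0], (10, 1)),
                    np.tile([11.0, 11.0, 11.0], (10, 1))])
point_labels = np.array([1] * 10 + [0] * 10)

mask = segmentation_support_mask(boxes, box_labels, points, point_labels)
print(mask)
```

In this toy case the second box would be treated as a likely false positive because the points inside it are labeled as background. The ensemble fusion step and ADRIT are not covered by this sketch.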
Cite
Text
Jain et al. "3D Object Detection and Tracking Refinement with Ensemble Methods and Spatiotemporal Filtering." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91767-7_6
Markdown
[Jain et al. "3D Object Detection and Tracking Refinement with Ensemble Methods and Spatiotemporal Filtering." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/jain2024eccvw-3d/) doi:10.1007/978-3-031-91767-7_6
BibTeX
@inproceedings{jain2024eccvw-3d,
title = {{3D Object Detection and Tracking Refinement with Ensemble Methods and Spatiotemporal Filtering}},
author = {Jain, Sandesh Rajendra and Thapa, Surendrabikram and Bharadwaj, Sanjana and Sarkar, Abhijit and Abbott, A. Lynn and Xuan, Jianhua},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {80--96},
doi = {10.1007/978-3-031-91767-7_6},
url = {https://mlanthology.org/eccvw/2024/jain2024eccvw-3d/}
}