The Effect of Pixel-Level Fusion on Object Tracking in Multi-Sensor Surveillance Video

Abstract

This paper investigates the impact of pixel-level fusion of videos from visible (VIZ) and infrared (IR) surveillance cameras on object tracking performance, as compared to tracking in single-modality videos. Tracking is performed with a particle filter that fuses a colour cue with the structural similarity measure (SSIM). The highest tracking accuracy was obtained in IR sequences, whereas the VIZ video showed the worst tracking performance due to higher levels of clutter. However, metrics for fusion assessment clearly point towards the supremacy of the multiresolution methods, especially the Dual-Tree Complex Wavelet Transform (DT-CWT) method. Thus, a new, tracking-oriented metric is needed that can accurately assess how fusion affects the performance of the tracker.
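The abstract mentions SSIM as one of the cues fused by the particle filter. As a rough illustration (not the authors' implementation), the standard global SSIM between two equally sized greyscale patches can be computed as below; the constants `K1`, `K2`, and dynamic range `L` follow the commonly used defaults:

```python
import numpy as np

def ssim(x, y, L=255.0, K1=0.01, K2=0.03):
    """Global SSIM index between two equally sized greyscale patches.

    A minimal sketch of the standard SSIM formula; trackers typically
    evaluate it over local windows rather than whole patches.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2  # stabilising constants
    mx, my = x.mean(), y.mean()            # luminance terms
    vx, vy = x.var(), y.var()              # contrast terms
    cov = ((x - mx) * (y - my)).mean()     # structure (covariance) term
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```

In a particle-filter tracker, a score like this can be mapped to a likelihood (e.g. via an exponential weighting) and combined with a colour-histogram cue to weight each particle.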

Cite

Text

Cvejic et al. "The Effect of Pixel-Level Fusion on Object Tracking in Multi-Sensor Surveillance Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383433

Markdown

[Cvejic et al. "The Effect of Pixel-Level Fusion on Object Tracking in Multi-Sensor Surveillance Video." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/cvejic2007cvpr-effect/) doi:10.1109/CVPR.2007.383433

BibTeX

@inproceedings{cvejic2007cvpr-effect,
  title     = {{The Effect of Pixel-Level Fusion on Object Tracking in Multi-Sensor Surveillance Video}},
  author    = {Cvejic, Nedeljko and Nikolov, Stavri G. and Knowles, Henry D. and Loza, Artur and Achim, Alin and Bull, David R. and Canagarajah, Cedric Nishan},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2007},
  doi       = {10.1109/CVPR.2007.383433},
  url       = {https://mlanthology.org/cvpr/2007/cvejic2007cvpr-effect/}
}