City-Scale Multi-Camera Vehicle Tracking by Semantic Attribute Parsing and Cross-Camera Tracklet Matching

Abstract

This paper focuses on the Multi-Target Multi-Camera Tracking (MTMCT) task in a city-scale multi-camera network. As the trajectory of each target is naturally split into multiple sub-trajectories (namely local tracklets) in different cameras, the key issue of MTMCT is how to match local tracklets belonging to the same target across different cameras. To this end, we propose an efficient two-step MTMCT approach to robustly track vehicles in a camera network. It first generates all local tracklets and then matches the ones belonging to the same target across different cameras. More specifically, in the local tracklet generation phase, we follow the tracking-by-detection paradigm and link the detections to local tracklets by graph clustering. In the cross-camera tracklet matching phase, we first develop a spatial-temporal attention mechanism to produce robust tracklet representations. We then prune false matching candidates by traffic topology reasoning and match tracklets across cameras using the recently proposed TRACklet-to-Target Assignment (TRACTA) algorithm. The proposed method is evaluated on the City-Scale Multi-Camera Vehicle Tracking task at the 2020 AI City Challenge and achieves the second-best results.
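The cross-camera matching phase can be illustrated with a minimal sketch. It is not the authors' implementation: plain cosine similarity over fixed feature vectors stands in for the spatial-temporal attention representation, and a greedy one-to-one assignment stands in for TRACTA. The `Tracklet` class, the `travel_time` table of allowed transit windows between camera pairs, and the `min_sim` threshold are all hypothetical names introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Tracklet:
    cam: str        # camera identifier (hypothetical)
    tid: int        # local tracklet id within that camera
    t_start: float  # time the vehicle enters the camera view
    t_end: float    # time the vehicle leaves the camera view
    feature: list   # appearance embedding (assumed to be precomputed)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def feasible(a, b, travel_time):
    """Topology pruning (assumed form): camera b must be reachable from
    camera a, and b must start within the allowed transit window after
    a ends."""
    window = travel_time.get((a.cam, b.cam))
    if window is None:
        return False
    lo, hi = window
    return lo <= b.t_start - a.t_end <= hi


def match_tracklets(tracklets, travel_time, min_sim=0.5):
    """Greedy one-to-one matching of tracklets across cameras.
    This is a simple stand-in for TRACTA, not the algorithm itself."""
    candidates = []
    for a in tracklets:
        for b in tracklets:
            if a.cam != b.cam and feasible(a, b, travel_time):
                sim = cosine(a.feature, b.feature)
                if sim >= min_sim:
                    candidates.append((sim, (a.cam, a.tid), (b.cam, b.tid)))
    candidates.sort(reverse=True)  # most similar pairs first
    used_src, used_dst, matches = set(), set(), []
    for _, src, dst in candidates:
        if src in used_src or dst in used_dst:
            continue
        used_src.add(src)
        used_dst.add(dst)
        matches.append((src, dst))
    return matches


tracklets = [
    Tracklet("c1", 1, 0.0, 10.0, [1.0, 0.0]),
    Tracklet("c2", 1, 20.0, 30.0, [0.9, 0.1]),    # plausible match
    Tracklet("c2", 2, 22.0, 28.0, [0.0, 1.0]),    # different appearance
    Tracklet("c2", 3, 500.0, 510.0, [1.0, 0.0]),  # outside transit window
]
travel_time = {("c1", "c2"): (5.0, 60.0)}  # seconds, hypothetical values
matches = match_tracklets(tracklets, travel_time)
```

In this toy example only the pair `("c1", 1)` and `("c2", 1)` survives: one candidate is pruned by the transit window before any appearance comparison, and another falls below the similarity threshold. Pruning first keeps the candidate set small, which is the role the abstract assigns to traffic topology reasoning.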

Cite

Text

He et al. "City-Scale Multi-Camera Vehicle Tracking by Semantic Attribute Parsing and Cross-Camera Tracklet Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00296

Markdown

[He et al. "City-Scale Multi-Camera Vehicle Tracking by Semantic Attribute Parsing and Cross-Camera Tracklet Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/he2020cvprw-cityscale/) doi:10.1109/CVPRW50498.2020.00296

BibTeX

@inproceedings{he2020cvprw-cityscale,
  title     = {{City-Scale Multi-Camera Vehicle Tracking by Semantic Attribute Parsing and Cross-Camera Tracklet Matching}},
  author    = {He, Yuhang and Han, Jie and Yu, Wentao and Hong, Xiaopeng and Wei, Xing and Gong, Yihong},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2020},
  pages     = {2456--2465},
  doi       = {10.1109/CVPRW50498.2020.00296},
  url       = {https://mlanthology.org/cvprw/2020/he2020cvprw-cityscale/}
}