Associative Embedding: End-to-End Learning for Joint Detection and Grouping

Abstract

We introduce associative embedding, a novel method for supervising convolutional neural networks for the task of detection and grouping. A number of computer vision problems can be framed in this manner, including multi-person pose estimation, instance segmentation, and multi-object tracking. Usually the grouping of detections is achieved with multi-stage pipelines; instead, we propose an approach that teaches a network to simultaneously output detections and group assignments. This technique can be easily integrated into any state-of-the-art network architecture that produces pixel-wise predictions. We show how to apply this method to multi-person pose estimation and report state-of-the-art performance on the MPII and MS-COCO datasets.
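
To make the idea of jointly predicted detections and group assignments concrete, the following is a minimal sketch, not the authors' implementation. It assumes each detection carries a scalar embedding ("tag") predicted by the network, and that detections with similar tags belong to the same person. The `Detection` and `Group` types, the `group_by_tag` function, the greedy assignment strategy, and the `threshold` value are all hypothetical and chosen only for illustration; the abstract does not specify the grouping procedure.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    joint: str   # hypothetical joint label, e.g. "head"
    x: float
    y: float
    tag: float   # scalar embedding assumed to be predicted with the detection


@dataclass
class Group:
    members: list = field(default_factory=list)

    def mean_tag(self) -> float:
        # Average tag of all detections currently in the group.
        return sum(d.tag for d in self.members) / len(self.members)


def group_by_tag(detections, threshold=0.5):
    """Greedily assign each detection to the group with the closest mean tag.

    A detection starts a new group when no existing group is within
    `threshold` in tag space or every nearby group already holds that joint.
    """
    groups = []
    for det in detections:
        candidates = [
            g for g in groups
            if abs(g.mean_tag() - det.tag) < threshold
            and all(m.joint != det.joint for m in g.members)
        ]
        if candidates:
            best = min(candidates, key=lambda g: abs(g.mean_tag() - det.tag))
            best.members.append(det)
        else:
            groups.append(Group(members=[det]))
    return groups


if __name__ == "__main__":
    dets = [
        Detection("head", 10, 10, 0.10),
        Detection("head", 50, 12, 2.00),
        Detection("wrist", 12, 30, 0.15),
        Detection("wrist", 52, 33, 1.90),
    ]
    for i, g in enumerate(group_by_tag(dets)):
        print(i, [(d.joint, d.tag) for d in g.members])
```

In this sketch the tag values carry no meaning on their own; only the distances between them matter, which is why a single threshold in tag space is enough to separate the two hypothetical people.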

Cite

Text

Newell et al. "Associative Embedding: End-to-End Learning for Joint Detection and Grouping." Neural Information Processing Systems, 2017.

Markdown

[Newell et al. "Associative Embedding: End-to-End Learning for Joint Detection and Grouping." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/newell2017neurips-associative/)

BibTeX

@inproceedings{newell2017neurips-associative,
  title     = {{Associative Embedding: End-to-End Learning for Joint Detection and Grouping}},
  author    = {Newell, Alejandro and Huang, Zhiao and Deng, Jia},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {2277--2287},
  url       = {https://mlanthology.org/neurips/2017/newell2017neurips-associative/}
}